[Machine Learning Experiment] Should a Horse Racing AI Be Trained Separately for Turf and Dirt?


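Introduction

I am developing a horse racing prediction AI, and I explain the build process in videos. Feel free to take a look.

Part of the source code shared here comes from a paid package.
If you want to run the same analysis, you can get it from the article below.

競馬予想プログラムソフト 全公開! (Horse Racing Prediction Software: Full Release)
I develop horse racing prediction software. This article is a compilation of the first through fourth installments. Subscribe to the channel to get 1,000 yen off…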
 

4. Second Model: Preparation

 

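This installment starts work on the second model,
which will take into account the pedigree analysis results from the previous installments.

However, the first model was partly built as a demo. It rests on some unrealistic assumptions, and parts of its problem setup are simply wrong.

So before building a model that uses pedigree information, this preparation part rebuilds the first model.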

 

Today's goals

  1. Rebuild the first model
  2. Compare it with models trained separately for turf and dirt
 

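Outline

  1. Preparation
  2. First model settings
  3. Training the first model
  4. Checking performance (launching the web app)
  5. How to split turf and dirt
  6. Training the turf-only and dirt-only models
  7. Conclusion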
 

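4-0. Preparation

Part of the source code comes from a paid package.
If you want to run the same analysis, you can get it from the article below.

競馬予想AI 統合分析プログラムの記事 (article on the integrated analysis program for the horse racing prediction AI)

Pedigree data is not needed this time, so this step covers everything
from importing the required modules
to loading the data used to build the model.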

In [1]:
import pathlib
import warnings
import lightgbm as lgbm

import sys
sys.path.append(".")
sys.path.append("..")
from src.model_manager.lgbm_manager import LightGBMDataset  # noqa
from src.model_manager.lgbm_manager import LightGBMModelManager  # noqa
from src.core.meta.bet_name_meta import BetName  # noqa
from src.data_manager.preprocess_tools import DataPreProcessor  # noqa
from src.data_manager.data_loader import DataLoader  # noqa


warnings.filterwarnings("ignore")

root_dir = pathlib.Path(".").absolute().parent

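start_year = 2000  # earliest year available in the DB
split_year = 2014  # first year of the training period
target_year = 2019  # first year of the test period
end_year = 2023  # last year of the test period (the DB must contain data for this year)

# Create the loader and preprocessor instances
data_loader = DataLoader(
    start_year,
    end_year,
    dbpath=root_dir / "data" / "keibadata.db"  # adjust dbpath to your environment; an absolute path is recommended
)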

dataPreP = DataPreProcessor()

df = data_loader.load_racedata()
df = dataPreP.exec_pipeline(df)
 
2024-08-10 12:57:11.300 | INFO     | src.data_manager.data_loader:load_racedata:23 - Get Year Range: 2000 -> 2023.
2024-08-10 12:57:11.301 | INFO     | src.data_manager.data_loader:load_racedata:24 - Loading Race Info ...
2024-08-10 12:57:13.243 | INFO     | src.data_manager.data_loader:load_racedata:26 - Loading Race Data ...
2024-08-10 12:57:32.115 | INFO     | src.data_manager.data_loader:load_racedata:28 - Merging Race Info and Race Data ...
 
2024-08-10 12:57:34.422 | INFO     | src.data_manager.preprocess_tools:__0_check_use_save_checkpoints:35 - Start PreProcess #0 ...
2024-08-10 12:57:34.425 | INFO     | src.data_manager.preprocess_tools:__1_exec_all_sub_prep1:38 - Start PreProcess #1 ...
2024-08-10 12:57:41.420 | INFO     | src.data_manager.preprocess_tools:__2_exec_all_sub_prep2:40 - Start PreProcess #2 ...
2024-08-10 12:57:56.039 | INFO     | src.data_manager.preprocess_tools:__3_convert_type_str_to_number:42 - Start PreProcess #3 ...
2024-08-10 12:58:00.261 | INFO     | src.data_manager.preprocess_tools:__4_drop_or_fillin_none_data:44 - Start PreProcess #4 ...
2024-08-10 12:58:04.361 | INFO     | src.data_manager.preprocess_tools:__5_exec_all_sub_prep5:46 - Start PreProcess #5 ...
2024-08-10 12:58:29.183 | INFO     | src.data_manager.preprocess_tools:__6_convert_label_to_rate_info:48 - Start PreProcess #6 ...
2024-08-10 12:58:40.926 | INFO     | src.data_manager.preprocess_tools:__7_convert_distance_to_smile:50 - Start PreProcess #7 ...
2024-08-10 12:58:41.173 | INFO     | src.data_manager.preprocess_tools:__8_category_encoding:52 - Start PreProcess #8 ...
2024-08-10 12:58:46.586 | INFO     | src.data_manager.preprocess_tools:__9_convert_raceClass_to_grade:54 - Start PreProcess #9 ...
 

Preparation complete.

 

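4-1. First model settings

4-1-1. Assumptions

Machine learning model: LightGBM
Learning task: binary classification
Problem setting: whether the horse finishes first
Training period: 2010 to December 2022
Validation period: 2010 to June 2023
Prediction period: 2019 to December 2023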
 

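4-1-2. Model description

Motivation

I want a model that serves as the baseline for everything else.
Any improved model must at least beat it.

Purpose of the model

Output the probability that each horse finishes first.

Hypothesis to check

Whether the winner can be predicted from basic information (race-day information that anyone can look up).

Caveat (as of August 2024)

The odds used are the final odds.
Odds will be analyzed in a later installment, so for now the final odds are used as a theoretical value.

Features

Odds, popularity rank, track surface, distance, distance category, racecourse, track condition,
horse number, gate number, age, carried weight, horse body weight, body weight change, interval since last race,
race grade, sex, jockey ID, trainer ID, horse ID

Objective variable

Binary: 1 if the horse finishes first, 0 otherwise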
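
As a minimal sketch of this target definition, assume each record carries the finishing position in a field called `label` (the later call `add_objective_column_to_df(df, "label", 1)` suggests this, but the package's implementation is not shown). The helper name `make_top_n_flag` is hypothetical and not part of the author's package.

```python
def make_top_n_flag(finish_positions, top_n=1):
    """Return 1 for horses finishing within top_n, else 0 (hypothetical helper)."""
    return [1 if pos <= top_n else 0 for pos in finish_positions]


# Finishing positions of five horses in one race
flags = make_top_n_flag([4, 3, 1, 2, 5])
print(flags)  # [0, 0, 1, 0, 0]
```

With `top_n=1` only the winner is flagged, which corresponds to the `label_in1` column name used later in the article.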

 

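4-2. Training the first model

4-2-0. Steps needed to build a machine learning model

  1. Prepare the data
  2. Create the datasets
  3. Run training
  4. Check the results
  5. Export the model

4-2-1. Data preparation

Create the model training instance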

In [3]:
lgbm_model_manager = LightGBMModelManager(
    # Path of the folder, under the models directory, named after the model to create.
    # Using an absolute path is the safe choice.
    root_dir / "models" / "first_model",
    split_year,
    target_year,
    end_year
)
 
2024-08-10 14:36:38.327 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_type, val=lightGBM
2024-08-10 14:36:38.329 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_id, val=first_model
2024-08-10 14:36:38.330 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_dir, val=e:\dev_um_ai\dev-um-ai\models\first_model
2024-08-10 14:36:38.332 | INFO     | src.model_manager.base_manager:__init__:43 - make directory. path: e:\dev_um_ai\dev-um-ai\models\first_model
2024-08-10 14:36:38.335 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_analyze_dir, val=e:\dev_um_ai\dev-um-ai\models\first_model\analyze
2024-08-10 14:36:38.338 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_predict_dir, val=e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict
2024-08-10 14:36:38.340 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_column, val=pred_prob
2024-08-10 14:36:38.341 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_rank_column, val=pred_rank
 

Create the explanatory and objective variables

In [4]:
# Columns used as explanatory variables
feature_columns = [
    'distance',
    'number',
    'boxNum',
    'odds',
    'favorite',
    'age',
    'jweight',
    'weight',
    'gl',
    'race_span',
    "raceGrade"  # add the race grade information
] + dataPreP.encoding_columns  # add the categorical columns
# Categorical columns:
# "place", "field", "sex", "condition", "jockeyId", "teacherId", "dist_cat",  "horseId"


# Column used as the objective variable
objective_column = "label_in1"

# Register the explanatory and objective columns with the model manager
lgbm_model_manager.set_feature_and_objective_columns(
    feature_columns, objective_column)

# Build the objective variable: flag the rows where the horse finished first
df = lgbm_model_manager.add_objective_column_to_df(df, "label", 1)
 
2024-08-10 14:46:13.622 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:77 - Set Feature columns. ['distance', 'number', 'boxNum', 'odds', 'favorite', 'age', 'jweight', 'weight', 'gl', 'race_span', 'raceGrade', 'place_en', 'field_en', 'sex_en', 'condition_en', 'jockeyId_en', 'teacherId_en', 'dist_cat_en', 'horseId_en']
2024-08-10 14:46:13.625 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:79 - Set Objective columns. label_in1
2024-08-10 14:46:13.626 | INFO     | src.model_manager.lgbm_manager:add_objective_column_to_df:80 - make objective data. label_in1. topN: 1
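
The `_en` suffix on the categorical columns in the log suggests they are integer-encoded before training. How `DataPreProcessor` actually does this is not shown in the article. The sketch below is one common approach (label encoding in order of first appearance), and the function name `label_encode` is hypothetical.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for v in values:
        if v not in codes:
            codes[v] = len(codes)
        encoded.append(codes[v])
    return encoded, codes


field_en, mapping = label_encode(["turf", "dirt", "turf", "turf", "dirt"])
print(field_en)  # [0, 1, 0, 0, 1]
print(mapping)   # {'turf': 0, 'dirt': 1}
```

Integer codes like these carry no order, so the model has to be told to treat the columns as categorical rather than numeric.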
 

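4-2-2. Creating the datasets

Dataset creation produces the following training, validation, and test data.

Training data: from 2010 up to just before the validation data
Validation data: the half year that follows the training data
Prediction data:
January to June 2019
July to December 2019
January to June 2020
July to December 2020
January to June 2021
July to December 2021
January to June 2022
July to December 2022
January to June 2023
July to December 2023

The two lines below are all it takes to create these datasets.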

In [5]:
dataset_mapping = lgbm_model_manager.make_dataset_mapping(df)
dataset_mapping = lgbm_model_manager.setup_dataset(dataset_mapping)
 
2024-08-10 14:55:04.157 | INFO     | src.data_manager.dataset_tools:make_dataset_mapping:103 - Generate dataset mapping. Year Range: 2019 -> 2023
2024-08-10 14:55:07.072 | INFO     | src.model_manager.lgbm_manager:setup_dataset:110 - Create LightGBM Dataset.
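
To make the split concrete, here is a rough sketch of how such half-year windows could be generated. It is not the implementation of `make_dataset_mapping`, which the article does not show. It assumes that each test window is preceded by a half-year validation window, that training starts on January 1 of `split_year`, and that windows are half-open intervals `[start, end)`. The function name `half_year_windows` is hypothetical.

```python
from datetime import date


def half_year_windows(split_year, target_year, end_year):
    """Build one train/valid/test window set per half year (hypothetical helper)."""
    windows = []
    for year in range(target_year, end_year + 1):
        for half in ("first", "second"):
            if half == "first":
                valid_start = date(year - 1, 7, 1)
                test_start, test_end = date(year, 1, 1), date(year, 7, 1)
            else:
                valid_start = date(year, 1, 1)
                test_start, test_end = date(year, 7, 1), date(year + 1, 1, 1)
            windows.append({
                "name": f"{year}{half}",
                "train": (date(split_year, 1, 1), valid_start),
                "valid": (valid_start, test_start),
                "test": (test_start, test_end),
            })
    return windows


windows = half_year_windows(split_year=2014, target_year=2019, end_year=2023)
print(len(windows))        # 10
print(windows[0]["name"])  # 2019first
```

The generated names follow the `2019first` / `2019second` pattern that appears as the model names in the training log.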
 

4-2-3. Run training

In [6]:
# Model parameters for LightGBM
# The values themselves are rough and untuned.
params = {
    'boosting_type': 'gbdt',
    # binary classification
    'objective': 'binary',
    'metric': 'auc',
    'verbose': 0,
    'seed': 77777,
    'learning_rate': 0.01,
    "n_estimators": 10000
}

lgbm_model_manager.train_all(
    params,
    dataset_mapping,
    stopping_rounds=500,  # stop early once the validation score has not improved for this many rounds
    val_num=250  # logging interval, in boosting rounds
)
 
2024-08-10 14:59:18.928 | INFO     | src.model_manager.lgbm_manager:save_root_model_info:281 - Save model params and dataset columns
2024-08-10 14:59:18.933 | INFO     | src.model_manager.lgbm_manager:train_all:262 - Training Start!
2024-08-10 14:59:18.934 | INFO     | src.model_manager.lgbm_manager:train_all:263 - ==================  train params  ========================
2024-08-10 14:59:18.935 | INFO     | src.model_manager.lgbm_manager:train_all:266 - boosting_type             =     gbdt
2024-08-10 14:59:18.936 | INFO     | src.model_manager.lgbm_manager:train_all:266 - objective                 =     binary
2024-08-10 14:59:18.938 | INFO     | src.model_manager.lgbm_manager:train_all:266 - metric                    =     auc
2024-08-10 14:59:18.940 | INFO     | src.model_manager.lgbm_manager:train_all:266 - verbose                   =     0
2024-08-10 14:59:18.941 | INFO     | src.model_manager.lgbm_manager:train_all:266 - seed                      =     77777
2024-08-10 14:59:18.942 | INFO     | src.model_manager.lgbm_manager:train_all:266 - learning_rate             =     0.01
2024-08-10 14:59:18.944 | INFO     | src.model_manager.lgbm_manager:train_all:266 - n_estimators              =     10000
2024-08-10 14:59:18.945 | INFO     | src.model_manager.lgbm_manager:train_all:267 - ==========================================================
2024-08-10 14:59:18.946 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019first
2024-08-10 14:59:18.948 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 14:59:19.182 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 14:59:19.184 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 14:59:19.422 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 14:59:30.076 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.871807	valid_1's auc: 0.840632
2024-08-10 14:59:50.062 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.89677	valid_1's auc: 0.840119
2024-08-10 14:59:55.592 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[63]	training's auc: 0.851271	valid_1's auc: 0.840833
2024-08-10 14:59:55.771 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2019first\model.params
2024-08-10 14:59:56.891 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019second
2024-08-10 14:59:56.892 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 14:59:57.073 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 14:59:57.075 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 14:59:57.300 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:00:09.275 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.870639	valid_1's auc: 0.819911
2024-08-10 15:00:31.367 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.894912	valid_1's auc: 0.819113
2024-08-10 15:00:55.486 | INFO     | lightgbm.basic:_log_info:191 - [750]	training's auc: 0.909803	valid_1's auc: 0.817025
2024-08-10 15:00:56.263 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[258]	training's auc: 0.871511	valid_1's auc: 0.819973
2024-08-10 15:00:57.070 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2019second\model.params
2024-08-10 15:00:58.334 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020first
2024-08-10 15:00:58.335 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:00:58.559 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:00:58.561 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:00:58.806 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:01:11.586 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.86771	valid_1's auc: 0.832936
2024-08-10 15:01:36.192 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.891366	valid_1's auc: 0.830434
2024-08-10 15:01:55.015 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[170]	training's auc: 0.859468	valid_1's auc: 0.833484
2024-08-10 15:01:55.543 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2020first\model.params
2024-08-10 15:01:56.708 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020second
2024-08-10 15:01:56.709 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:01:56.934 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:01:56.936 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:01:57.198 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:02:10.774 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.866477	valid_1's auc: 0.82393
2024-08-10 15:02:37.416 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.889361	valid_1's auc: 0.822261
2024-08-10 15:02:53.458 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[133]	training's auc: 0.855343	valid_1's auc: 0.824802
2024-08-10 15:02:53.859 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2020second\model.params
2024-08-10 15:02:54.969 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021first
2024-08-10 15:02:54.971 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:02:55.160 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:02:55.161 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:02:55.441 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:03:10.001 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.86393	valid_1's auc: 0.822076
2024-08-10 15:03:38.653 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.886215	valid_1's auc: 0.820405
2024-08-10 15:03:41.101 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[20]	training's auc: 0.842479	valid_1's auc: 0.822224
2024-08-10 15:03:41.235 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2021first\model.params
2024-08-10 15:03:42.247 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021second
2024-08-10 15:03:42.248 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:03:42.446 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:03:42.447 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:03:42.705 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:03:58.465 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.861851	valid_1's auc: 0.835742
2024-08-10 15:04:29.155 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.883583	valid_1's auc: 0.834695
2024-08-10 15:04:46.434 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[124]	training's auc: 0.850631	valid_1's auc: 0.836322
2024-08-10 15:04:46.789 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2021second\model.params
2024-08-10 15:04:47.909 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022first
2024-08-10 15:04:47.910 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:04:48.094 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:04:48.096 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:04:48.389 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:05:05.606 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.861216	valid_1's auc: 0.826826
2024-08-10 15:05:39.098 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.882493	valid_1's auc: 0.825194
2024-08-10 15:05:52.470 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[88]	training's auc: 0.847512	valid_1's auc: 0.827772
2024-08-10 15:05:52.728 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2022first\model.params
2024-08-10 15:05:53.764 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022second
2024-08-10 15:05:53.765 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:05:53.994 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:05:53.996 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:05:54.294 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:06:11.673 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859738	valid_1's auc: 0.843091
2024-08-10 15:06:46.386 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.880705	valid_1's auc: 0.842285
2024-08-10 15:07:22.512 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[233]	training's auc: 0.85823	valid_1's auc: 0.843208
2024-08-10 15:07:23.364 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2022second\model.params
2024-08-10 15:07:24.657 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023first
2024-08-10 15:07:24.658 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:07:24.888 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:07:24.889 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:07:25.214 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:07:43.829 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859688	valid_1's auc: 0.839903
2024-08-10 15:08:20.570 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.8801	valid_1's auc: 0.838717
2024-08-10 15:08:56.642 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[221]	training's auc: 0.856973	valid_1's auc: 0.840087
2024-08-10 15:08:57.454 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2023first\model.params
2024-08-10 15:08:58.741 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023second
2024-08-10 15:08:58.742 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-10 15:08:58.962 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-10 15:08:58.963 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-10 15:08:59.285 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-10 15:09:18.716 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859039	valid_1's auc: 0.838319
2024-08-10 15:09:57.586 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.879004	valid_1's auc: 0.836402
2024-08-10 15:10:00.030 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[13]	training's auc: 0.840206	valid_1's auc: 0.838479
2024-08-10 15:10:00.168 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\first_model\params\2023second\model.params
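
One pattern in the log is worth spelling out: for `2019first` the best iteration is 63, yet training ran past round 500. That is how early stopping with a patience of 500 rounds behaves. The following is a minimal pure-Python sketch of that rule, not LightGBM's actual code, and the function name `early_stop` is hypothetical.

```python
def early_stop(valid_scores, patience):
    """Return (best_round, stop_round), both 1-indexed.

    Training stops at the first round that is `patience` rounds past the
    best validation score seen so far (higher is better, as with AUC).
    If that never happens, stop_round is the last round.
    """
    best_score = float("-inf")
    best_round = 0
    for rnd, score in enumerate(valid_scores, start=1):
        if score > best_score:
            best_score, best_round = score, rnd
        elif rnd - best_round >= patience:
            return best_round, rnd
    return best_round, len(valid_scores)


print(early_stop([0.80, 0.84, 0.83, 0.83, 0.83, 0.85], patience=3))  # (2, 5)
```

Under this rule a best round of 63 with a patience of 500 would stop at round 563, which is consistent with the log printing the round-500 line shortly before the early-stopping message. Very small best iterations such as 13 or 20 mean the validation AUC peaked almost immediately.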
 

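4-2-4. Checking the results

In the training above, the model was fitted on the training data, and the validation data was used to keep it from overfitting.

Measuring the model's real performance requires predictions on test data that was set aside separately.

So this step runs inference on the test data, which was not used in training.

The code below is all it takes.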

In [7]:
for dataset_dict in dataset_mapping.values():
    lgbm_model_manager.load_model(dataset_dict.name)
    lgbm_model_manager.predict(dataset_dict)
 
2024-08-10 15:51:55.406 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019first
2024-08-10 15:51:55.469 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019first
2024-08-10 15:51:56.554 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019first
2024-08-10 15:51:56.590 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2019first
2024-08-10 15:51:59.518 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019second
2024-08-10 15:51:59.639 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019second
2024-08-10 15:52:02.309 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019second
2024-08-10 15:52:02.355 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2019second
2024-08-10 15:52:05.419 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020first
2024-08-10 15:52:05.515 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020first
2024-08-10 15:52:07.600 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020first
2024-08-10 15:52:07.636 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2020first
2024-08-10 15:52:10.760 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020second
2024-08-10 15:52:10.830 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020second
2024-08-10 15:52:12.680 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020second
2024-08-10 15:52:12.734 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2020second
2024-08-10 15:52:16.109 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021first
2024-08-10 15:52:16.161 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021first
2024-08-10 15:52:17.602 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021first
2024-08-10 15:52:17.659 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2021first
2024-08-10 15:52:21.303 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021second
2024-08-10 15:52:21.363 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021second
2024-08-10 15:52:23.450 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021second
2024-08-10 15:52:23.505 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2021second
2024-08-10 15:52:27.362 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022first
2024-08-10 15:52:27.423 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022first
2024-08-10 15:52:29.321 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022first
2024-08-10 15:52:29.381 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2022first
2024-08-10 15:52:33.474 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022second
2024-08-10 15:52:33.624 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022second
2024-08-10 15:52:37.295 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022second
2024-08-10 15:52:37.356 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2022second
2024-08-10 15:52:41.289 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023first
2024-08-10 15:52:41.411 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023first
2024-08-10 15:52:45.080 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023first
2024-08-10 15:52:45.142 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2023first
2024-08-10 15:52:49.316 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023second
2024-08-10 15:52:49.367 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023second
2024-08-10 15:52:51.391 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023second
2024-08-10 15:52:51.462 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\00_predict\2023second
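
The export settings logged when the model manager was created name two columns, `pred_prob` and `pred_rank`. Assuming `pred_rank` is the within-race rank of `pred_prob` (an assumption, since the article does not show how it is computed), it could be derived roughly like this. The function name `add_pred_rank` and the sample rows are hypothetical.

```python
from collections import defaultdict


def add_pred_rank(rows):
    """Add 'pred_rank' (1 = highest pred_prob) within each raceId, in place."""
    by_race = defaultdict(list)
    for row in rows:
        by_race[row["raceId"]].append(row)
    for race_rows in by_race.values():
        ordered = sorted(race_rows, key=lambda r: r["pred_prob"], reverse=True)
        for rank, row in enumerate(ordered, start=1):
            row["pred_rank"] = rank
    return rows


rows = [
    {"raceId": "A", "number": 1, "pred_prob": 0.10},
    {"raceId": "A", "number": 2, "pred_prob": 0.55},
    {"raceId": "B", "number": 1, "pred_prob": 0.30},
    {"raceId": "A", "number": 3, "pred_prob": 0.35},
]
add_pred_rank(rows)
print([r["pred_rank"] for r in rows])  # [3, 1, 1, 2]
```

Ranking inside each race matters because a binary classifier scores horses independently, while a bet is chosen by comparing horses within the same race.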
 

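4-2-5. Exporting the model

Before the model can be exported, its performance figures have to be computed.

The following are computed:

  1. Profit and loss
  2. Basic statistics
  3. Odds graph

Profit and loss

The profit-and-loss code has not been maintained, so it is a little clunky, but running the code below is enough.

The current source can only compute profit and loss for win (tansho) bets.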

In [8]:
bet_mode = BetName.tan
bet_column = lgbm_model_manager.get_bet_column(bet_mode=bet_mode)
pl_column = lgbm_model_manager.get_profit_loss_column(bet_mode=bet_mode)
for dataset_dict in dataset_mapping.values():
    lgbm_model_manager.set_bet_column(dataset_dict, bet_mode)
_, dfbetva, dfbette = lgbm_model_manager.merge_dataframe_data(
    dataset_mapping, mode=True)

dfbetva, dfbette = lgbm_model_manager.generate_profit_loss(
    dfbetva, dfbette, bet_mode)

dfbette[["raceDate", "raceId", "label", "favorite", bet_column, pl_column]]
 
2024-08-10 16:18:08.425 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=bet_columns_map, val={'tan': 'bet_tan'}
2024-08-10 16:18:08.425 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=pl_column_map, val={'tan': 'pl_tan'}
2024-08-10 16:18:09.374 | INFO     | src.model_manager.base_manager:__save_profit_loss:646 - Save profit loss data. save_path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\tan\profit_loss
2024-08-10 16:18:09.376 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=profit_loss_dir, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\first_model\\analyze\\tan\\profit_loss'}
Out[8]:
           raceDate        raceId  label  favorite  bet_tan  pl_tan
896624   2019-01-05  201906010101      4         1        1  -100.0
896642   2019-01-05  201906010102      3         1        1  -100.0
896651   2019-01-05  201906010103      1         1        1   140.0
896670   2019-01-05  201906010104      2         1        1  -100.0
896693   2019-01-05  201906010105      2         1        1  -100.0
...
1126007  2023-12-28  202309050908      1         1        1   170.0
1126030  2023-12-28  202309050909      1         1        1   250.0
1126031  2023-12-28  202309050910      4         1        1  -100.0
1126051  2023-12-28  202309050911      2         1        1  -100.0
1126061  2023-12-28  202309050912      3         1        1  -100.0

16630 rows × 6 columns
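
The `pl_tan` values in this table are consistent with a flat 100-yen win bet: a loss is -100, and a win pays odds × 100 minus the stake (for example, 140.0 would correspond to odds of 2.4). Here is a minimal sketch under that assumption. The function name `win_bet_profit` and the `stake` parameter are hypothetical, and this is not the code behind `generate_profit_loss`.

```python
def win_bet_profit(finish_position, odds, stake=100):
    """Profit or loss of a single win bet on one horse."""
    if finish_position == 1:
        return odds * stake - stake
    return -float(stake)


print(win_bet_profit(4, 3.1))  # -100.0
print(win_bet_profit(1, 2.5))  # 150.0
```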

 

Basic statistics

The basic statistics aggregate the return rate, the hit rate, and the number of bets by popularity rank.

Just run the code below.

In [9]:
lgbm_model_manager.basic_analyze(dataset_mapping)
 
2024-08-10 16:18:17.410 | INFO     | src.model_manager.base_manager:basic_analyze:220 - Start basic analyze.
2024-08-10 16:18:17.901 | INFO     | src.model_manager.base_manager:basic_analyze:256 - Saving Return And Hit Rate Summary.
2024-08-10 16:18:17.909 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=return_hit_rate_file, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\first_model\\analyze\\tan\\hit_and_return_rate.csv'}
2024-08-10 16:18:17.910 | INFO     | src.model_manager.base_manager:basic_analyze:259 - Saving Favorite Bet Num Summary.
2024-08-10 16:18:17.922 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=fav_bet_num_dir, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\first_model\\analyze\\tan\\fav_bet_num'}
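
As a rough illustration of what these summaries contain, the sketch below computes a hit rate, a return rate, and the bet count per popularity rank from rows shaped like the profit-and-loss table shown earlier. It is not the implementation of `basic_analyze`: the helper name `summarize_bets`, the 100-yen stake, and the reading of `pl_tan` as net profit per bet are all assumptions made for illustration.

```python
from collections import Counter

STAKE = 100  # assumed stake per bet in yen (hypothetical)


def summarize_bets(rows, bet_col="bet_tan", pl_col="pl_tan"):
    """Summarize placed bets; `pl_col` is assumed to hold net profit per bet."""
    bets = [r for r in rows if r[bet_col] == 1]
    if not bets:
        return {"n_bets": 0, "hit_rate": 0.0, "return_rate": 0.0,
                "bets_by_favorite": {}}
    total_stake = STAKE * len(bets)
    hits = sum(1 for r in bets if r[pl_col] > 0)  # a positive P/L counts as a hit
    net = sum(r[pl_col] for r in bets)
    return {
        "n_bets": len(bets),
        "hit_rate": 100.0 * hits / len(bets),
        "return_rate": 100.0 * (total_stake + net) / total_stake,
        "bets_by_favorite": dict(Counter(r["favorite"] for r in bets)),
    }


# The first five rows of the profit-and-loss table above
rows = [
    {"favorite": 1, "bet_tan": 1, "pl_tan": -100.0},
    {"favorite": 1, "bet_tan": 1, "pl_tan": -100.0},
    {"favorite": 1, "bet_tan": 1, "pl_tan": 140.0},
    {"favorite": 1, "bet_tan": 1, "pl_tan": -100.0},
    {"favorite": 1, "bet_tan": 1, "pl_tan": -100.0},
]
summary = summarize_bets(rows)
print(summary)
```

With these five sample rows, one bet in five wins, so the hit rate is 20% and, under the net-profit assumption, the return rate is 48%.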
 

Computing the odds graph¶

In [10]:
dftrain, dfvalid, dftest = lgbm_model_manager.merge_dataframe_data(
    dataset_mapping,
    mode=True
)
summary_dict = lgbm_model_manager.gegnerate_odds_graph(
    dftrain, dfvalid, dftest, bet_mode)
print("Check the odds graph for the 'test' data")
summary_dict["test"].fillna(0)
 
2024-08-10 16:18:23.029 | INFO     | src.model_manager.base_manager:__save_odds_graph:514 - Save Odds Graph. save_path: e:\dev_um_ai\dev-um-ai\models\first_model\analyze\tan\odds_graph
2024-08-10 16:18:23.030 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=odds_graph_file, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\first_model\\analyze\\tan\\odds_graph'}
 
Check the odds graph for the 'test' data
Out[10]:
odds_round  Win rate (%)  Support rate (%)  Win rate needed for >100% return (%)  weight  Count
1.25 65.805471 64.000000 80.000000 0.039567 658
1.75 46.336567 45.714286 57.142857 0.179735 2989
2.25 35.981967 35.555556 44.444444 0.213410 3549
2.75 29.793673 29.090909 36.363636 0.218581 3635
3.25 24.479568 24.615385 30.769231 0.155983 2594
3.75 23.122413 21.333333 26.666667 0.101684 1691
4.25 14.965986 18.823529 23.529412 0.044197 735
4.75 16.666667 16.842105 21.052632 0.023452 390
5.25 16.818182 15.238095 19.047619 0.013229 220
5.75 9.523810 13.913043 17.391304 0.005051 84
6.25 13.513514 12.800000 16.000000 0.002225 37
6.75 5.555556 11.851852 14.814815 0.001082 18
7.25 9.090909 11.034483 13.793103 0.000661 11
7.75 16.666667 10.322581 12.903226 0.000361 6
8.75 50.000000 9.142857 11.428571 0.000120 2
9.25 0.000000 8.648649 10.810811 0.000060 1
9.75 0.000000 8.205128 10.256410 0.000241 4
10.25 0.000000 7.804878 9.756098 0.000060 1
10.75 0.000000 7.441860 9.302326 0.000060 1
15.75 0.000000 5.079365 6.349206 0.000060 1
22.25 100.000000 3.595506 4.494382 0.000060 1
34.75 0.000000 2.302158 2.877698 0.000060 1
50.00 0.000000 1.600000 2.000000 0.000060 1
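
The support-rate and break-even columns (支持率 and 回収率100%超) are consistent with simple functions of the odds: every row matches 80/odds and 100/odds respectively, for example 64.0 and 80.0 at odds 1.25. The sketch below reproduces those two columns and estimates a per-bin return rate. The 0.8 payout ratio is inferred from the table values rather than taken from the source code, the function names are hypothetical, and using the bin value as the odds makes the return estimate approximate.

```python
def support_rate(odds, payout_ratio=0.8):
    """Share of the pool (in %) implied by the odds; 0.8 is inferred from the table."""
    return 100.0 * payout_ratio / odds


def breakeven_win_rate(odds):
    """Win rate (in %) at which betting at these odds returns exactly 100%."""
    return 100.0 / odds


def approx_return_rate(win_rate_pct, odds):
    """Approximate return rate (in %) when betting every horse in an odds bin."""
    return win_rate_pct * odds


# (odds_round, observed win rate) pairs copied from the table above
for odds, win_rate in [(1.25, 65.805471), (2.25, 35.981967), (4.25, 14.965986)]:
    print(
        odds,
        round(support_rate(odds), 6),
        round(breakeven_win_rate(odds), 6),
        round(approx_return_rate(win_rate, odds), 2),
    )
```

A bin is profitable only when its observed win rate exceeds the break-even rate. In the table above, no bin with more than 100 samples reaches that threshold.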
 

Exporting the model¶

 

Running the following exports the model's performance results.

In [11]:
lgbm_model_manager.export_model_info()
 
2024-08-10 16:22:41.109 | INFO     | src.model_manager.base_manager:export_model_info:848 - Export Model info json. export path: e:\dev_um_ai\dev-um-ai\models\first_model\model_info.json
 

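4-3. Checking performance (launching the web app)¶

 

Running the code below applies the database migrations for the web app. The `runserver` line is commented out in this cell; uncomment it to actually start the server on port 12345.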

In [11]:
! python ../app_keiba/manage.py makemigrations
! python ../app_keiba/manage.py migrate 
! echo server launch OK
# ! python ../app_keiba/manage.py runserver 12345
 
No changes detected
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, model_analyzer, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying admin.0003_logentry_add_action_flag_choices... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying auth.0010_alter_group_name_max_length... OK
  Applying auth.0011_update_proxy_permissions... OK
  Applying auth.0012_alter_user_first_name_max_length... OK
  Applying model_analyzer.0001_initial... OK
  Applying model_analyzer.0002_alter_modellist_motivate... OK
  Applying sessions.0001_initial... OK
server launch OK
 

Once "server launch OK" is displayed, click the link below to open the web app.

http://localhost:12345/index.html
 
  Model ID  Support-rate OGS  Return-rate OGS  AonBOGS
1  first_model (baseline)  0.41924  -7.64492  
 

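4-4. How to split turf and dirt¶

 

There are two ways to train on turf and dirt separately:

Pattern 1
Split only the validation and test data into turf and dirt.

Pattern 2
Split every dataset (training, validation, and test) into turf and dirt.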
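
The difference between the two patterns can be sketched with plain Python records standing in for DataFrame rows. This is an illustration only: `split_by_surface` and `make_datasets` are hypothetical helpers, "ダート" is an assumed dirt label, and every row that is not turf ("芝") is treated as dirt.

```python
def split_by_surface(rows, turf_label="芝"):
    """Return (turf_rows, other_rows); anything not turf is treated as dirt."""
    turf = [r for r in rows if r["field"] == turf_label]
    other = [r for r in rows if r["field"] != turf_label]
    return turf, other


def make_datasets(train, valid, test, pattern):
    """Build per-surface (train, valid, test) triples for Pattern 1 or 2."""
    valid_t, valid_d = split_by_surface(valid)
    test_t, test_d = split_by_surface(test)
    if pattern == 1:
        # Pattern 1: both models share the full, unsplit training data
        train_t, train_d = train, train
    elif pattern == 2:
        # Pattern 2: the training data is split by surface as well
        train_t, train_d = split_by_surface(train)
    else:
        raise ValueError("pattern must be 1 or 2")
    return {
        "turf": (train_t, valid_t, test_t),
        "dirt": (train_d, valid_d, test_d),
    }


# Hypothetical toy rows; "ダート" is an assumed label for dirt races
train = [
    {"raceId": "t1", "field": "芝"},
    {"raceId": "t2", "field": "ダート"},
    {"raceId": "t3", "field": "芝"},
]
valid = [{"raceId": "v1", "field": "芝"}, {"raceId": "v2", "field": "ダート"}]
test = [{"raceId": "x1", "field": "ダート"}]

p1 = make_datasets(train, valid, test, pattern=1)
p2 = make_datasets(train, valid, test, pattern=2)
print(len(p1["turf"][0]), len(p2["turf"][0]))
```

Under Pattern 1 each model still sees every training row, so only early stopping and evaluation become surface-specific. Under Pattern 2 each model trains on fewer rows, but all of them come from its own surface.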

 

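Splitting the data on a specific condition would require modifying the core source code.
Since there is no evidence yet that the split helps, changing the source is risky, so here we instead build separate turf and dirt models and merge their results.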
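
A minimal sketch of the merge step, assuming each model produces prediction rows only for races on its own surface. `merge_predictions` is a hypothetical helper, the `pred_prob` values are made up, and the assignment of race IDs to surfaces is invented for the example.

```python
def merge_predictions(turf_rows, dirt_rows):
    """Combine per-surface predictions into one chronologically ordered list."""
    turf_races = {r["raceId"] for r in turf_rows}
    dirt_races = {r["raceId"] for r in dirt_rows}
    overlap = turf_races & dirt_races
    if overlap:
        # A race must belong to exactly one surface, otherwise bets are doubled
        raise ValueError(f"races present in both splits: {sorted(overlap)}")
    merged = [{**r, "model": "turf"} for r in turf_rows]
    merged += [{**r, "model": "dirt"} for r in dirt_rows]
    merged.sort(key=lambda r: (r["raceDate"], r["raceId"]))
    return merged


# Made-up predictions; which race ran on which surface is hypothetical
turf_rows = [
    {"raceDate": "2019-01-05", "raceId": "201906010105", "pred_prob": 0.31},
]
dirt_rows = [
    {"raceDate": "2023-12-28", "raceId": "202309050908", "pred_prob": 0.27},
    {"raceDate": "2019-01-05", "raceId": "201906010101", "pred_prob": 0.42},
]
merged = merge_predictions(turf_rows, dirt_rows)
print([(r["raceId"], r["model"]) for r in merged])
```

Checking for overlapping race IDs before concatenating guards against a race being counted by both models, which would inflate the bet count in the merged evaluation.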

 

4-5. Training the separate turf and dirt models¶

 

4-5-1. Building the Pattern 1 model¶

 

Splitting the data into turf and dirt¶

In [12]:
dataset_grass, dataset_dirt = {}, {}
for key, dataset in dataset_mapping.items():
    # Pattern 1: only the validation and test data are split by surface.
    # "芝" is the turf label in the `field` column; every other value is treated as dirt.
    idfv = dataset.valid
    is_turf_v = idfv["field"].isin(["芝"])
    idfv_g, idfv_d = idfv[is_turf_v], idfv[~is_turf_v]
    idft = dataset.test
    is_turf_t = idft["field"].isin(["芝"])
    idft_g, idft_d = idft[is_turf_t], idft[~is_turf_t]

    # region Turf dataset (the training data is shared and not split)
    dataset_g = LightGBMDataset(key, dataset.train, idfv_g, idft_g)
    dataset_g.train_dataset, dataset_g.valid_dataset, dataset_g.test_dataset = \
        lgbm.Dataset(dataset.train[feature_columns], dataset.train[objective_column]), \
        lgbm.Dataset(idfv_g[feature_columns], idfv_g[objective_column]), \
        lgbm.Dataset(idft_g[feature_columns], idft_g[objective_column])
    # endregion
    # region Dirt dataset (the training data is shared and not split)
    dataset_d = LightGBMDataset(key, dataset.train, idfv_d, idft_d)
    dataset_d.train_dataset, dataset_d.valid_dataset, dataset_d.test_dataset = \
        lgbm.Dataset(dataset.train[feature_columns], dataset.train[objective_column]), \
        lgbm.Dataset(idfv_d[feature_columns], idfv_d[objective_column]), \
        lgbm.Dataset(idft_d[feature_columns], idft_d[objective_column])
    # endregion
    dataset_grass[key] = dataset_g
    dataset_dirt[key] = dataset_d
 

First, build the turf-only model¶

 
Run training¶
In [13]:
lgbm_model_manager = LightGBMModelManager(
    # Create the folder for the turf-only model
    root_dir / "models" / "model_only_grass",
    split_year,
    target_year,
    end_year
)
lgbm_model_manager.set_feature_and_objective_columns(
    feature_columns, objective_column)
lgbm_model_manager.topN = 1

lgbm_model_manager.train_all(
    params,
    dataset_grass,
    stopping_rounds=500,  # early-stopping patience: stop once the validation score has not improved for this many rounds
    val_num=250  # logging interval, in iterations
)
 
2024-08-05 04:15:40.343 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_type, val=lightGBM
2024-08-05 04:15:40.345 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_id, val=model_only_grass
2024-08-05 04:15:40.347 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_grass
2024-08-05 04:15:40.348 | INFO     | src.model_manager.base_manager:__init__:43 - make directory. path: e:\dev_um_ai\dev-um-ai\models\model_only_grass
2024-08-05 04:15:40.353 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_analyze_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze
2024-08-05 04:15:40.354 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_predict_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict
2024-08-05 04:15:40.355 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_column, val=pred_prob
2024-08-05 04:15:40.356 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_rank_column, val=pred_rank
2024-08-05 04:15:40.359 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:77 - Set Feature columns. ['distance', 'number', 'boxNum', 'odds', 'favorite', 'age', 'jweight', 'weight', 'gl', 'race_span', 'raceGrade', 'place_en', 'field_en', 'sex_en', 'condition_en', 'jockeyId_en', 'teacherId_en', 'dist_cat_en', 'horseId_en']
2024-08-05 04:15:40.360 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:79 - Set Objective columns. label_in1
2024-08-05 04:15:40.361 | INFO     | src.model_manager.lgbm_manager:save_root_model_info:281 - Save model params and dataset columns
2024-08-05 04:15:40.366 | INFO     | src.model_manager.lgbm_manager:train_all:262 - Training Start!
2024-08-05 04:15:40.367 | INFO     | src.model_manager.lgbm_manager:train_all:263 - ==================  train params  ========================
2024-08-05 04:15:40.368 | INFO     | src.model_manager.lgbm_manager:train_all:266 - boosting_type             =     gbdt
2024-08-05 04:15:40.369 | INFO     | src.model_manager.lgbm_manager:train_all:266 - objective                 =     binary
2024-08-05 04:15:40.370 | INFO     | src.model_manager.lgbm_manager:train_all:266 - metric                    =     auc
2024-08-05 04:15:40.371 | INFO     | src.model_manager.lgbm_manager:train_all:266 - verbose                   =     0
2024-08-05 04:15:40.373 | INFO     | src.model_manager.lgbm_manager:train_all:266 - seed                      =     77777
2024-08-05 04:15:40.374 | INFO     | src.model_manager.lgbm_manager:train_all:266 - learning_rate             =     0.01
2024-08-05 04:15:40.375 | INFO     | src.model_manager.lgbm_manager:train_all:266 - n_estimators              =     10000
2024-08-05 04:15:40.375 | INFO     | src.model_manager.lgbm_manager:train_all:267 - ==========================================================
2024-08-05 04:15:40.376 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019first
2024-08-05 04:15:40.377 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:15:40.566 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:15:40.568 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:15:40.777 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:15:51.188 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.871807	valid_1's auc: 0.846729
2024-08-05 04:16:11.179 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.89677	valid_1's auc: 0.846496
2024-08-05 04:16:12.590 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[15]	training's auc: 0.843835	valid_1's auc: 0.847674
2024-08-05 04:16:12.701 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2019first\model.params
2024-08-05 04:16:13.657 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019second
2024-08-05 04:16:13.658 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:16:13.821 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:16:13.823 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:16:14.023 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:16:25.823 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.870639	valid_1's auc: 0.814135
2024-08-05 04:16:48.232 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.894912	valid_1's auc: 0.812338
2024-08-05 04:16:49.919 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[16]	training's auc: 0.84428	valid_1's auc: 0.815285
2024-08-05 04:16:50.027 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2019second\model.params
2024-08-05 04:16:50.979 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020first
2024-08-05 04:16:50.980 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:16:51.174 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:16:51.175 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:16:51.388 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:17:04.015 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.86771	valid_1's auc: 0.837084
2024-08-05 04:17:28.689 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.891366	valid_1's auc: 0.833318
2024-08-05 04:17:52.496 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[216]	training's auc: 0.864218	valid_1's auc: 0.837541
2024-08-05 04:17:53.126 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2020first\model.params
2024-08-05 04:17:54.311 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020second
2024-08-05 04:17:54.312 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:17:54.502 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:17:54.503 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:17:54.727 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:18:08.198 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.866477	valid_1's auc: 0.809263
2024-08-05 04:18:35.366 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.889361	valid_1's auc: 0.806705
2024-08-05 04:18:44.266 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[71]	training's auc: 0.8496	valid_1's auc: 0.810884
2024-08-05 04:18:44.466 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2020second\model.params
2024-08-05 04:18:45.475 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021first
2024-08-05 04:18:45.477 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:18:45.687 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:18:45.688 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:18:45.925 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:19:00.435 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.86393	valid_1's auc: 0.825407
2024-08-05 04:19:29.590 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.886215	valid_1's auc: 0.823367
2024-08-05 04:20:01.442 | INFO     | lightgbm.basic:_log_info:191 - [750]	training's auc: 0.900037	valid_1's auc: 0.821308
2024-08-05 04:20:01.830 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[253]	training's auc: 0.864237	valid_1's auc: 0.825461
2024-08-05 04:20:02.629 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2021first\model.params
2024-08-05 04:20:03.838 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021second
2024-08-05 04:20:03.839 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:20:04.039 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:20:04.041 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:20:04.287 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:20:19.879 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.861851	valid_1's auc: 0.825318
2024-08-05 04:20:50.933 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.883583	valid_1's auc: 0.82441
2024-08-05 04:21:12.691 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[154]	training's auc: 0.853197	valid_1's auc: 0.82568
2024-08-05 04:21:13.167 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2021second\model.params
2024-08-05 04:21:14.238 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022first
2024-08-05 04:21:14.240 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:21:14.451 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:21:14.453 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:21:14.709 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:21:31.799 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.861216	valid_1's auc: 0.832859
2024-08-05 04:22:06.034 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.882493	valid_1's auc: 0.831195
2024-08-05 04:22:08.678 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[17]	training's auc: 0.840881	valid_1's auc: 0.833574
2024-08-05 04:22:08.804 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2022first\model.params
2024-08-05 04:22:09.805 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022second
2024-08-05 04:22:09.807 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:22:10.019 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:22:10.021 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:22:10.278 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:22:27.523 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859738	valid_1's auc: 0.834225
2024-08-05 04:23:02.911 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.880705	valid_1's auc: 0.832829
2024-08-05 04:23:40.013 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[233]	training's auc: 0.85823	valid_1's auc: 0.834375
2024-08-05 04:23:40.755 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2022second\model.params
2024-08-05 04:23:41.986 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023first
2024-08-05 04:23:41.987 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:23:42.202 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:23:42.203 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:23:42.510 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:24:00.995 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859688	valid_1's auc: 0.829658
2024-08-05 04:24:36.009 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.8801	valid_1's auc: 0.828485
2024-08-05 04:25:08.297 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[221]	training's auc: 0.856973	valid_1's auc: 0.830037
2024-08-05 04:25:09.070 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2023first\model.params
2024-08-05 04:25:10.297 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023second
2024-08-05 04:25:10.297 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:25:10.486 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:25:10.487 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:25:10.736 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:25:29.769 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859039	valid_1's auc: 0.827814
2024-08-05 04:26:08.071 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.879004	valid_1's auc: 0.82535
2024-08-05 04:26:10.269 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[12]	training's auc: 0.839905	valid_1's auc: 0.828929
2024-08-05 04:26:10.375 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\params\2023second\model.params
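
In the logs above, LightGBM reports a best iteration far earlier than the point where training stopped (for example iteration 15 for 2019first, even though a log line is still printed at iteration 500). That is how patience-based early stopping behaves: training continues until the validation score has failed to improve for the configured number of rounds, and the best iteration is then kept. The sketch below is a simplified pure-Python illustration of that rule, not LightGBM's implementation; `early_stop` is a hypothetical helper and the scores are made up.

```python
def early_stop(valid_scores, patience):
    """Return (best_iteration, last_iteration_run) for a higher-is-better metric.

    Iterations are 1-indexed, matching the numbering in the training logs.
    """
    best_iter, best_score = 0, float("-inf")
    for i, score in enumerate(valid_scores, start=1):
        if score > best_score:
            best_iter, best_score = i, score
        elif i - best_iter >= patience:
            # No improvement for `patience` rounds: stop and keep the best model
            return best_iter, i
    return best_iter, len(valid_scores)


# Made-up validation AUC values that peak at the second iteration
scores = [0.80, 0.85, 0.84, 0.83, 0.82, 0.81]
best_iter, stopped_at = early_stop(scores, patience=3)
print(best_iter, stopped_at)
```

With a patience of 500, a best iteration of 15 means roughly 500 further boosting rounds were trained and then discarded, which accounts for most of the wall-clock time between the "Start training" and "Saving model" log lines.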
 
Run inference¶
In [14]:
for dataset_dict in dataset_grass.values():
    lgbm_model_manager.load_model(dataset_dict.name)
    lgbm_model_manager.predict(dataset_dict)

bet_mode = BetName.tan
bet_column = lgbm_model_manager.get_bet_column(bet_mode=bet_mode)
pl_column = lgbm_model_manager.get_profit_loss_column(bet_mode=bet_mode)
for dataset_dict in dataset_grass.values():
    lgbm_model_manager.set_bet_column(dataset_dict, bet_mode)
_, dfbetva_grass, dfbette_grass = lgbm_model_manager.merge_dataframe_data(
    dataset_grass, mode=True)
 
2024-08-05 04:26:12.308 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019first
2024-08-05 04:26:12.358 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019first
2024-08-05 04:26:13.246 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019first
2024-08-05 04:26:13.305 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2019first
2024-08-05 04:26:15.896 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019second
2024-08-05 04:26:15.955 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019second
2024-08-05 04:26:16.903 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019second
2024-08-05 04:26:16.966 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2019second
2024-08-05 04:26:19.707 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020first
2024-08-05 04:26:19.805 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020first
2024-08-05 04:26:22.068 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020first
2024-08-05 04:26:22.122 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2020first
2024-08-05 04:26:25.033 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020second
2024-08-05 04:26:25.089 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020second
2024-08-05 04:26:26.453 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020second
2024-08-05 04:26:26.520 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2020second
2024-08-05 04:26:29.562 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021first
2024-08-05 04:26:29.671 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021first
2024-08-05 04:26:32.719 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021first
2024-08-05 04:26:32.787 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2021first
2024-08-05 04:26:35.955 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021second
2024-08-05 04:26:36.031 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021second
2024-08-05 04:26:38.239 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021second
2024-08-05 04:26:38.309 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2021second
2024-08-05 04:26:41.596 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022first
2024-08-05 04:26:41.658 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022first
2024-08-05 04:26:43.192 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022first
2024-08-05 04:26:43.272 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2022first
2024-08-05 04:26:46.737 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022second
2024-08-05 04:26:46.844 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022second
2024-08-05 04:26:50.430 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022second
2024-08-05 04:26:50.502 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2022second
2024-08-05 04:26:54.167 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023first
2024-08-05 04:26:54.300 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023first
2024-08-05 04:26:57.941 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023first
2024-08-05 04:26:58.016 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2023first
2024-08-05 04:27:01.781 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023second
2024-08-05 04:27:01.833 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023second
2024-08-05 04:27:03.679 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023second
2024-08-05 04:27:03.761 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass\analyze\00_predict\2023second
2024-08-05 04:27:06.731 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=bet_columns_map, val={'tan': 'bet_tan'}
2024-08-05 04:27:06.732 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=pl_column_map, val={'tan': 'pl_tan'}
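
With `topN = 1` and the `pred_prob` / `pred_rank` columns named in the logs, a natural reading is that one horse per race, the one with the highest predicted probability, receives the win bet. The sketch below illustrates that selection rule on plain Python records. It is an assumption about what `set_bet_column` does, not its actual implementation; `flag_top_n` is a hypothetical helper and the probabilities are made up.

```python
from collections import defaultdict


def flag_top_n(rows, top_n=1):
    """Rank horses within each race by pred_prob and flag the top N as bets."""
    by_race = defaultdict(list)
    for r in rows:
        by_race[r["raceId"]].append(r)
    flagged = []
    for race_rows in by_race.values():
        ranked = sorted(race_rows, key=lambda r: r["pred_prob"], reverse=True)
        for rank, r in enumerate(ranked, start=1):
            flagged.append({**r, "pred_rank": rank, "bet_tan": int(rank <= top_n)})
    return flagged


# Made-up predictions for two races; `number` is the horse number
rows = [
    {"raceId": "201906010101", "number": 1, "pred_prob": 0.12},
    {"raceId": "201906010101", "number": 2, "pred_prob": 0.41},
    {"raceId": "201906010101", "number": 3, "pred_prob": 0.07},
    {"raceId": "201906010102", "number": 1, "pred_prob": 0.33},
    {"raceId": "201906010102", "number": 2, "pred_prob": 0.29},
]
flagged = flag_top_n(rows, top_n=1)
bets = sorted((r["raceId"], r["number"]) for r in flagged if r["bet_tan"] == 1)
print(bets)
```

Because the ranking is done per race, the number of bets equals the number of races in the split, which is why splitting the test data by surface changes the bet count of each model.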
 

Build the dirt-only model¶

 
Run training¶
In [15]:
lgbm_model_manager_dirt = LightGBMModelManager(
    # Create the folder for the dirt-only model
    root_dir / "models" / "model_only_dirt",
    split_year,
    target_year,
    end_year
)
lgbm_model_manager_dirt.set_feature_and_objective_columns(
    feature_columns, objective_column)
lgbm_model_manager_dirt.topN = 1

lgbm_model_manager_dirt.train_all(
    params,
    dataset_dirt,
    stopping_rounds=500,  # early-stopping patience: stop once the validation score has not improved for this many rounds
    val_num=250  # logging interval, in iterations
)
 
2024-08-05 04:27:07.338 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_type, val=lightGBM
2024-08-05 04:27:07.340 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_id, val=model_only_dirt
2024-08-05 04:27:07.341 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_dirt
2024-08-05 04:27:07.343 | INFO     | src.model_manager.base_manager:__init__:43 - make directory. path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt
2024-08-05 04:27:07.347 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_analyze_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze
2024-08-05 04:27:07.347 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_predict_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict
2024-08-05 04:27:07.348 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_column, val=pred_prob
2024-08-05 04:27:07.349 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_rank_column, val=pred_rank
2024-08-05 04:27:07.350 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:77 - Set Feature columns. ['distance', 'number', 'boxNum', 'odds', 'favorite', 'age', 'jweight', 'weight', 'gl', 'race_span', 'raceGrade', 'place_en', 'field_en', 'sex_en', 'condition_en', 'jockeyId_en', 'teacherId_en', 'dist_cat_en', 'horseId_en']
2024-08-05 04:27:07.351 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:79 - Set Objective columns. label_in1
2024-08-05 04:27:07.352 | INFO     | src.model_manager.lgbm_manager:save_root_model_info:281 - Save model params and dataset columns
2024-08-05 04:27:07.356 | INFO     | src.model_manager.lgbm_manager:train_all:262 - Training Start!
2024-08-05 04:27:07.356 | INFO     | src.model_manager.lgbm_manager:train_all:263 - ==================  train params  ========================
2024-08-05 04:27:07.357 | INFO     | src.model_manager.lgbm_manager:train_all:266 - boosting_type             =     gbdt
2024-08-05 04:27:07.358 | INFO     | src.model_manager.lgbm_manager:train_all:266 - objective                 =     binary
2024-08-05 04:27:07.359 | INFO     | src.model_manager.lgbm_manager:train_all:266 - metric                    =     auc
2024-08-05 04:27:07.360 | INFO     | src.model_manager.lgbm_manager:train_all:266 - verbose                   =     0
2024-08-05 04:27:07.361 | INFO     | src.model_manager.lgbm_manager:train_all:266 - seed                      =     77777
2024-08-05 04:27:07.362 | INFO     | src.model_manager.lgbm_manager:train_all:266 - learning_rate             =     0.01
2024-08-05 04:27:07.362 | INFO     | src.model_manager.lgbm_manager:train_all:266 - n_estimators              =     10000
2024-08-05 04:27:07.363 | INFO     | src.model_manager.lgbm_manager:train_all:267 - ==========================================================
2024-08-05 04:27:07.364 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019first
2024-08-05 04:27:07.365 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:27:07.543 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:27:07.544 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:27:07.763 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:27:17.996 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.871807	valid_1's auc: 0.833088
2024-08-05 04:27:37.923 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.89677	valid_1's auc: 0.832282
2024-08-05 04:27:59.627 | INFO     | lightgbm.basic:_log_info:191 - [750]	training's auc: 0.912683	valid_1's auc: 0.829816
2024-08-05 04:28:01.259 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[270]	training's auc: 0.874199	valid_1's auc: 0.833208
2024-08-05 04:28:02.048 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2019first\model.params
2024-08-05 04:28:03.282 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019second
2024-08-05 04:28:03.283 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:28:03.472 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:28:03.474 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:28:03.674 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:28:15.664 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.870639	valid_1's auc: 0.824531
2024-08-05 04:28:37.797 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.894912	valid_1's auc: 0.824518
2024-08-05 04:29:01.931 | INFO     | lightgbm.basic:_log_info:191 - [750]	training's auc: 0.909803	valid_1's auc: 0.823077
2024-08-05 04:29:19.679 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[441]	training's auc: 0.890461	valid_1's auc: 0.824739
2024-08-05 04:29:20.959 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2019second\model.params
2024-08-05 04:29:22.505 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020first
2024-08-05 04:29:22.506 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:29:22.669 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:29:22.671 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:29:22.887 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:29:35.552 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.86771	valid_1's auc: 0.82787
2024-08-05 04:30:00.386 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.891366	valid_1's auc: 0.826831
2024-08-05 04:30:13.713 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[116]	training's auc: 0.854617	valid_1's auc: 0.82877
2024-08-05 04:30:14.020 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2020first\model.params
2024-08-05 04:30:15.051 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020second
2024-08-05 04:30:15.052 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:30:15.240 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:30:15.242 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:30:15.467 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:30:29.156 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.866477	valid_1's auc: 0.835707
2024-08-05 04:30:56.858 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.889361	valid_1's auc: 0.834774
2024-08-05 04:31:13.608 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[133]	training's auc: 0.855343	valid_1's auc: 0.836302
2024-08-05 04:31:14.000 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2020second\model.params
2024-08-05 04:31:15.048 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021first
2024-08-05 04:31:15.049 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:31:15.238 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:31:15.240 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:31:15.480 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:31:29.984 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.86393	valid_1's auc: 0.817968
2024-08-05 04:31:59.341 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.886215	valid_1's auc: 0.816825
2024-08-05 04:32:01.727 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[17]	training's auc: 0.841838	valid_1's auc: 0.818268
2024-08-05 04:32:01.824 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2021first\model.params
2024-08-05 04:32:02.779 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021second
2024-08-05 04:32:02.780 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:32:02.977 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:32:02.979 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:32:03.228 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:32:18.864 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.861851	valid_1's auc: 0.843667
2024-08-05 04:32:50.088 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.883583	valid_1's auc: 0.84249
2024-08-05 04:32:54.977 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[33]	training's auc: 0.84323	valid_1's auc: 0.844655
2024-08-05 04:32:55.111 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2021second\model.params
2024-08-05 04:32:56.088 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022first
2024-08-05 04:32:56.090 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:32:56.321 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:32:56.323 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:32:56.579 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:33:13.852 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.861216	valid_1's auc: 0.81993
2024-08-05 04:33:48.301 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.882493	valid_1's auc: 0.818373
2024-08-05 04:34:03.216 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[94]	training's auc: 0.847917	valid_1's auc: 0.821375
2024-08-05 04:34:03.469 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2022first\model.params
2024-08-05 04:34:04.785 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022second
2024-08-05 04:34:04.787 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:34:05.003 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:34:05.004 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:34:05.273 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:34:22.654 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859738	valid_1's auc: 0.849977
2024-08-05 04:34:58.308 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.880705	valid_1's auc: 0.84974
2024-08-05 04:35:38.268 | INFO     | lightgbm.basic:_log_info:191 - [750]	training's auc: 0.893643	valid_1's auc: 0.849005
2024-08-05 04:35:44.163 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[291]	training's auc: 0.863579	valid_1's auc: 0.850116
2024-08-05 04:35:45.268 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2022second\model.params
2024-08-05 04:35:46.640 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023first
2024-08-05 04:35:46.640 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:35:46.842 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:35:46.843 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:35:47.154 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:36:05.385 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859688	valid_1's auc: 0.851099
2024-08-05 04:36:42.627 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.8801	valid_1's auc: 0.849899
2024-08-05 04:37:24.301 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[245]	training's auc: 0.859216	valid_1's auc: 0.85114
2024-08-05 04:37:25.156 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2023first\model.params
2024-08-05 04:37:26.438 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023second
2024-08-05 04:37:26.439 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:37:26.646 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:37:26.648 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:37:26.925 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:37:46.250 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.859039	valid_1's auc: 0.84669
2024-08-05 04:38:25.838 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.879004	valid_1's auc: 0.845218
2024-08-05 04:39:11.058 | INFO     | lightgbm.basic:_log_info:191 - [750]	training's auc: 0.891391	valid_1's auc: 0.843779
2024-08-05 04:39:13.360 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[263]	training's auc: 0.860106	valid_1's auc: 0.846709
2024-08-05 04:39:14.268 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\params\2023second\model.params
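
The "best iteration" entries in the log above can be summarized in a few lines of Python. This is only a minimal sketch and not part of the original notebook: the numbers are copied by hand from the dirt-only training log, and the names `best_dirt`, `mean_auc` and `earliest` are made up for illustration.

```python
from statistics import mean

# (best iteration, validation AUC) per model, copied from the
# "Early stopping, best iteration is" lines of the dirt-only log above.
# The dict name is illustrative only.
best_dirt = {
    "2019first": (270, 0.833208),
    "2019second": (441, 0.824739),
    "2020first": (116, 0.82877),
    "2020second": (133, 0.836302),
    "2021first": (17, 0.818268),
    "2021second": (33, 0.844655),
    "2022first": (94, 0.821375),
    "2022second": (291, 0.850116),
    "2023first": (245, 0.85114),
    "2023second": (263, 0.846709),
}

# Average validation AUC over the ten half-year models.
mean_auc = mean(auc for _, auc in best_dirt.values())

# Model whose validation score peaked earliest.
earliest = min(best_dirt, key=lambda name: best_dirt[name][0])

print(f"mean validation AUC: {mean_auc:.4f}")
print(f"earliest best iteration: {earliest} ({best_dirt[earliest][0]} rounds)")
```

Three of the ten models (2021first, 2021second, 2022first) reach their best validation score in fewer than 100 rounds, which is worth keeping in mind when comparing these models with the combined one.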
 
Running inference¶
In [16]:
for dataset_dict in dataset_dirt.values():
    lgbm_model_manager_dirt.load_model(dataset_dict.name)
    lgbm_model_manager_dirt.predict(dataset_dict)

bet_mode = BetName.tan
bet_column = lgbm_model_manager_dirt.get_bet_column(bet_mode=bet_mode)
pl_column = lgbm_model_manager_dirt.get_profit_loss_column(bet_mode=bet_mode)
for dataset_dict in dataset_dirt.values():
    lgbm_model_manager_dirt.set_bet_column(dataset_dict, bet_mode)
_, dfbetva_dirt, dfbette_dirt = lgbm_model_manager_dirt.merge_dataframe_data(
    dataset_dirt, mode=True)
 
2024-08-05 04:39:16.467 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019first
2024-08-05 04:39:16.618 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019first
2024-08-05 04:39:18.714 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019first
2024-08-05 04:39:18.772 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2019first
2024-08-05 04:39:21.337 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019second
2024-08-05 04:39:21.538 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019second
2024-08-05 04:39:26.255 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019second
2024-08-05 04:39:26.318 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2019second
2024-08-05 04:39:29.012 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020first
2024-08-05 04:39:29.106 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020first
2024-08-05 04:39:30.531 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020first
2024-08-05 04:39:30.599 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2020first
2024-08-05 04:39:33.384 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020second
2024-08-05 04:39:33.478 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020second
2024-08-05 04:39:35.158 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020second
2024-08-05 04:39:35.225 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2020second
2024-08-05 04:39:38.319 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021first
2024-08-05 04:39:38.381 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021first
2024-08-05 04:39:39.625 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021first
2024-08-05 04:39:39.699 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2021first
2024-08-05 04:39:42.912 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021second
2024-08-05 04:39:42.973 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021second
2024-08-05 04:39:44.509 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021second
2024-08-05 04:39:44.585 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2021second
2024-08-05 04:39:47.867 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022first
2024-08-05 04:39:47.951 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022first
2024-08-05 04:39:49.721 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022first
2024-08-05 04:39:49.796 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2022first
2024-08-05 04:39:53.267 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022second
2024-08-05 04:39:53.436 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022second
2024-08-05 04:39:57.637 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022second
2024-08-05 04:39:57.719 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2022second
2024-08-05 04:40:01.432 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023first
2024-08-05 04:40:01.586 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023first
2024-08-05 04:40:05.336 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023first
2024-08-05 04:40:05.415 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2023first
2024-08-05 04:40:09.129 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023second
2024-08-05 04:40:09.283 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023second
2024-08-05 04:40:13.574 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023second
2024-08-05 04:40:13.665 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt\analyze\00_predict\2023second
2024-08-05 04:40:16.606 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=bet_columns_map, val={'tan': 'bet_tan'}
2024-08-05 04:40:16.607 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=pl_column_map, val={'tan': 'pl_tan'}
 

Exporting model info¶

 

This is the tedious part: the turf and dirt results have to be merged so that the web app can import them.
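
Conceptually, the merge is a row-wise concatenation of the two result tables followed by a sort back into race order. Below is a minimal sketch with toy data. It assumes only that both tables share the `raceDate` and `raceId` columns used later in this section; the ID format and the `pred_prob` values are invented for illustration.

```python
import pandas as pd

# Toy prediction results; all values are invented for illustration.
df_grass = pd.DataFrame({
    "raceDate": ["2019-01-05", "2019-01-06"],
    "raceId": ["201901050611", "201901060611"],
    "pred_prob": [0.31, 0.12],
})
df_dirt = pd.DataFrame({
    "raceDate": ["2019-01-05", "2019-01-06"],
    "raceId": ["201901050601", "201901060601"],
    "pred_prob": [0.22, 0.45],
})

# Stack the rows, then restore race order and renumber the index from 0.
df_merged = pd.concat([df_dirt, df_grass]).sort_values(
    ["raceDate", "raceId"], ignore_index=True)

print(df_merged)
```

Passing `ignore_index=True` matters here: without it the merged frame would keep the duplicated index labels of the two inputs.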

In [17]:
lgbm_model_manager_merge = LightGBMModelManager(
    # create the model folder for the merged results
    root_dir / "models" / "model_field_div_part1",
    split_year,
    target_year,
    end_year
)
lgbm_model_manager_merge.set_feature_and_objective_columns(
    feature_columns, objective_column)
lgbm_model_manager_merge.topN = 1
lgbm_model_manager_merge.expt_info_map.bet_columns_map = {"tan": "bet_tan"}
lgbm_model_manager_merge.expt_info_map.pl_column_map = {"tan": "pl_tan"}
lgbm_model_manager_merge.expt_info_map.confidence_column = "pred_prob"
lgbm_model_manager_merge.expt_info_map.confidence_rank_column = "pred_rank"
 
2024-08-05 04:40:17.260 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_type, val=lightGBM
2024-08-05 04:40:17.261 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_id, val=model_grass_dirt_part1
2024-08-05 04:40:17.263 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_dir, val=e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part1
2024-08-05 04:40:17.265 | INFO     | src.model_manager.base_manager:__init__:43 - make directory. path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part1
2024-08-05 04:40:17.271 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_analyze_dir, val=e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part1\analyze
2024-08-05 04:40:17.273 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_predict_dir, val=e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part1\analyze\00_predict
2024-08-05 04:40:17.274 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_column, val=pred_prob
2024-08-05 04:40:17.275 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_rank_column, val=pred_rank
2024-08-05 04:40:17.277 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:77 - Set Feature columns. ['distance', 'number', 'boxNum', 'odds', 'favorite', 'age', 'jweight', 'weight', 'gl', 'race_span', 'raceGrade', 'place_en', 'field_en', 'sex_en', 'condition_en', 'jockeyId_en', 'teacherId_en', 'dist_cat_en', 'horseId_en']
2024-08-05 04:40:17.278 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:79 - Set Objective columns. label_in1
 

Calculating profit and loss

In [18]:
import pandas as pd

# Merge the dirt and turf bet results and restore race order
sort_keys = ["raceDate", "raceId"]
dfbetva = pd.concat([dfbetva_dirt, dfbetva_grass]).sort_values(
    sort_keys, ignore_index=True)
dfbette = pd.concat([dfbette_dirt, dfbette_grass]).sort_values(
    sort_keys, ignore_index=True)

dfbetva, dfbette = lgbm_model_manager_merge.generate_profit_loss(
    dfbetva, dfbette, bet_mode)
 
2024-08-05 04:40:17.873 | INFO     | src.model_manager.base_manager:__save_profit_loss:646 - Save profit loss data. save_path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part1\analyze\tan\profit_loss
2024-08-05 04:40:17.875 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=profit_loss_dir, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part1\\analyze\\tan\\profit_loss'}
 

Computing basic information

In [19]:
dataset_merge = {}
for key, dset_g in dataset_grass.items():
    dset_d = dataset_dirt[key]  # dirt dataset for the same period
    dset_m = LightGBMDataset(key, None, None, None)
    # Combine the turf and dirt predictions and restore race order
    dset_m.pred_train = pd.concat(
        [dset_g.pred_train, dset_d.pred_train]).sort_values(
            ["raceDate", "raceId"], ignore_index=True)
    dset_m.pred_test = pd.concat(
        [dset_g.pred_test, dset_d.pred_test]).sort_values(
            ["raceDate", "raceId"], ignore_index=True)
    dset_m.pred_valid = pd.concat(
        [dset_g.pred_valid, dset_d.pred_valid]).sort_values(
            ["raceDate", "raceId"], ignore_index=True)
    dataset_merge[key] = dset_m


lgbm_model_manager_merge.model_name_list = lgbm_model_manager.model_name_list
lgbm_model_manager_merge.basic_analyze(dataset_merge)
 
2024-08-05 04:40:20.941 | INFO     | src.model_manager.base_manager:basic_analyze:220 - Start basic analyze.
2024-08-05 04:40:21.600 | INFO     | src.model_manager.base_manager:basic_analyze:256 - Saving Return And Hit Rate Summary.
2024-08-05 04:40:21.610 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=return_hit_rate_file, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part1\\analyze\\tan\\hit_and_return_rate.csv'}
2024-08-05 04:40:21.612 | INFO     | src.model_manager.base_manager:basic_analyze:259 - Saving Favorite Bet Num Summary.
2024-08-05 04:40:21.627 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=fav_bet_num_dir, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part1\\analyze\\tan\\fav_bet_num'}
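
When combining the per-period turf and dirt datasets, pairing entries by key rather than by position avoids silently mismatching periods. Here is a minimal sketch in which plain dicts stand in for the dataset mappings; the names and list contents are placeholders, not the notebook's real objects.

```python
# Placeholder mappings: period name -> list of row labels.
turf = {"2019first": ["t1", "t2"], "2019second": ["t3"]}
dirt = {"2019second": ["d2"], "2019first": ["d1"]}  # different insertion order

merged = {}
for key, turf_rows in turf.items():
    # Look up the partner by key. Iterating a dict directly yields its
    # keys (not its values), and zip() would pair entries by position.
    dirt_rows = dirt[key]
    merged[key] = sorted(turf_rows + dirt_rows)

print(merged)
```

A missing period raises a `KeyError` in this form instead of being silently skipped, which makes mismatched mappings easier to notice.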
 

Creating the odds graph

In [20]:
dft_g, dfv_g, dfte_g = lgbm_model_manager.merge_dataframe_data(
    dataset_grass,
    mode=True
)

dft_d, dfv_d, dfte_d = lgbm_model_manager_dirt.merge_dataframe_data(
    dataset_dirt,
    mode=True
)

# Stack the turf and dirt frames for train, valid and test
dft_m = pd.concat([dft_g, dft_d])
dfv_m = pd.concat([dfv_g, dfv_d])
dfte_m = pd.concat([dfte_g, dfte_d])

summary_dict = lgbm_model_manager_merge.gegnerate_odds_graph(
    dft_m, dfv_m, dfte_m, bet_mode)
 
2024-08-05 04:40:23.771 | INFO     | src.model_manager.base_manager:__save_odds_graph:514 - Save Odds Graph. save_path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part1\analyze\tan\odds_graph
2024-08-05 04:40:23.773 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=odds_graph_file, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part1\\analyze\\tan\\odds_graph'}
 

Exporting model info

In [21]:
lgbm_model_manager_merge.export_model_info()
 
2024-08-05 04:40:23.814 | INFO     | src.model_manager.base_manager:export_model_info:848 - Export Model info json. export path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part1\model_info.json
 

4-5-2. Building the Pattern 2 model¶

 

Splitting the data into turf and dirt¶

In [22]:
dataset_grass, dataset_dirt = {}, {}
for key, dataset in dataset_mapping.items():
    # "芝" is the turf label; every other row goes to the dirt side
    idftr = dataset.train
    is_turf = idftr["field"].isin(["芝"])
    idftr_g, idftr_d = idftr[is_turf], idftr[~is_turf]
    idfv = dataset.valid
    is_turf = idfv["field"].isin(["芝"])
    idfv_g, idfv_d = idfv[is_turf], idfv[~is_turf]
    idft = dataset.test
    is_turf = idft["field"].isin(["芝"])
    idft_g, idft_d = idft[is_turf], idft[~is_turf]

    # region turf dataset
    dataset_g = LightGBMDataset(key, idftr_g, idfv_g, idft_g)
    dataset_g.train_dataset, dataset_g.valid_dataset, dataset_g.test_dataset = \
        lgbm.Dataset(idftr_g[feature_columns], idftr_g[objective_column]), \
        lgbm.Dataset(idfv_g[feature_columns], idfv_g[objective_column]), \
        lgbm.Dataset(idft_g[feature_columns], idft_g[objective_column])
    # endregion
    # region dirt dataset
    dataset_d = LightGBMDataset(key, idftr_d, idfv_d, idft_d)
    dataset_d.train_dataset, dataset_d.valid_dataset, dataset_d.test_dataset = \
        lgbm.Dataset(idftr_d[feature_columns], idftr_d[objective_column]), \
        lgbm.Dataset(idfv_d[feature_columns], idfv_d[objective_column]), \
        lgbm.Dataset(idft_d[feature_columns], idft_d[objective_column])
    dataset_grass[key] = dataset_g
    dataset_dirt[key] = dataset_d
    # endregion
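
The split relies on a boolean mask and its negation, so every row lands in exactly one of the two subsets. Below is a small sketch with toy data. The non-turf labels "ダ" and "障" are assumed values for illustration, since the actual contents of the `field` column are not shown here.

```python
import pandas as pd

# Toy data; the field labels other than "芝" are assumed for illustration.
df = pd.DataFrame({
    "raceId": [1, 2, 3, 4, 5],
    "field": ["芝", "ダ", "芝", "障", "ダ"],
})

# True for turf rows, False for everything else.
is_turf = df["field"].isin(["芝"])
df_turf, df_other = df[is_turf], df[~is_turf]

# The two subsets are disjoint and together cover the original frame.
print(len(df_turf), len(df_other), len(df))
print(sorted(df_other["field"].unique()))
```

Because the second subset is defined by negation, any surface that is not turf ends up on the "dirt" side, not only rows explicitly labelled as dirt.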
 

First, build the turf-only model¶

 
Running training¶
In [23]:
lgbm_model_manager = LightGBMModelManager(
    # create the folder for the turf-only model
    root_dir / "models" / "model_only_grass2",
    split_year,
    target_year,
    end_year
)
lgbm_model_manager.set_feature_and_objective_columns(
    feature_columns, objective_column)
lgbm_model_manager.topN = 1

lgbm_model_manager.train_all(
    params,
    dataset_grass,
    stopping_rounds=500,  # stop early once the validation score has not improved for this many rounds
    val_num=250  # logging interval (in rounds)
)
 
2024-08-05 04:40:27.691 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_type, val=lightGBM
2024-08-05 04:40:27.693 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_id, val=model_only_grass2
2024-08-05 04:40:27.694 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_grass2
2024-08-05 04:40:27.695 | INFO     | src.model_manager.base_manager:__init__:43 - make directory. path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2
2024-08-05 04:40:27.699 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_analyze_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze
2024-08-05 04:40:27.699 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_predict_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict
2024-08-05 04:40:27.701 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_column, val=pred_prob
2024-08-05 04:40:27.702 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_rank_column, val=pred_rank
2024-08-05 04:40:27.705 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:77 - Set Feature columns. ['distance', 'number', 'boxNum', 'odds', 'favorite', 'age', 'jweight', 'weight', 'gl', 'race_span', 'raceGrade', 'place_en', 'field_en', 'sex_en', 'condition_en', 'jockeyId_en', 'teacherId_en', 'dist_cat_en', 'horseId_en']
2024-08-05 04:40:27.706 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:79 - Set Objective columns. label_in1
2024-08-05 04:40:27.707 | INFO     | src.model_manager.lgbm_manager:save_root_model_info:281 - Save model params and dataset columns
2024-08-05 04:40:27.718 | INFO     | src.model_manager.lgbm_manager:train_all:262 - Training Start!
2024-08-05 04:40:27.719 | INFO     | src.model_manager.lgbm_manager:train_all:263 - ==================  train params  ========================
2024-08-05 04:40:27.720 | INFO     | src.model_manager.lgbm_manager:train_all:266 - boosting_type             =     gbdt
2024-08-05 04:40:27.722 | INFO     | src.model_manager.lgbm_manager:train_all:266 - objective                 =     binary
2024-08-05 04:40:27.723 | INFO     | src.model_manager.lgbm_manager:train_all:266 - metric                    =     auc
2024-08-05 04:40:27.724 | INFO     | src.model_manager.lgbm_manager:train_all:266 - verbose                   =     0
2024-08-05 04:40:27.725 | INFO     | src.model_manager.lgbm_manager:train_all:266 - seed                      =     77777
2024-08-05 04:40:27.726 | INFO     | src.model_manager.lgbm_manager:train_all:266 - learning_rate             =     0.01
2024-08-05 04:40:27.727 | INFO     | src.model_manager.lgbm_manager:train_all:266 - n_estimators              =     10000
2024-08-05 04:40:27.729 | INFO     | src.model_manager.lgbm_manager:train_all:267 - ==========================================================
2024-08-05 04:40:27.730 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019first
2024-08-05 04:40:27.732 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:40:27.846 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:40:27.848 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:40:27.974 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:40:33.296 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.892565	valid_1's auc: 0.843136
2024-08-05 04:40:42.059 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.924031	valid_1's auc: 0.84129
2024-08-05 04:40:42.744 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[18]	training's auc: 0.853563	valid_1's auc: 0.844379
2024-08-05 04:40:42.839 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2019first\model.params
2024-08-05 04:40:43.820 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019second
2024-08-05 04:40:43.821 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:40:43.937 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:40:43.939 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:40:44.073 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:40:49.800 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.8896	valid_1's auc: 0.809824
2024-08-05 04:40:59.331 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.919221	valid_1's auc: 0.807434
2024-08-05 04:41:01.670 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[57]	training's auc: 0.861228	valid_1's auc: 0.813443
2024-08-05 04:41:01.814 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2019second\model.params
2024-08-05 04:41:02.789 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020first
2024-08-05 04:41:02.790 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:41:02.906 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:41:02.908 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:41:03.055 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:41:09.359 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.886358	valid_1's auc: 0.833312
2024-08-05 04:41:20.143 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.915381	valid_1's auc: 0.829913
2024-08-05 04:41:22.641 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[55]	training's auc: 0.858773	valid_1's auc: 0.834047
2024-08-05 04:41:22.792 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2020first\model.params
2024-08-05 04:41:23.761 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020second
2024-08-05 04:41:23.762 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:41:23.889 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:41:23.890 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:41:24.028 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:41:30.955 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.884546	valid_1's auc: 0.807105
2024-08-05 04:41:43.111 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.912097	valid_1's auc: 0.803467
2024-08-05 04:41:48.789 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[110]	training's auc: 0.865402	valid_1's auc: 0.808919
2024-08-05 04:41:49.078 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2020second\model.params
2024-08-05 04:41:50.119 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021first
2024-08-05 04:41:50.120 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:41:50.232 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:41:50.233 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:41:50.397 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:41:57.801 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.881131	valid_1's auc: 0.821904
2024-08-05 04:42:10.714 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.909083	valid_1's auc: 0.819465
2024-08-05 04:42:11.873 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[20]	training's auc: 0.849044	valid_1's auc: 0.823702
2024-08-05 04:42:12.028 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2021first\model.params
2024-08-05 04:42:12.986 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021second
2024-08-05 04:42:12.988 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:42:13.158 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:42:13.160 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:42:13.323 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:42:21.193 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.87889	valid_1's auc: 0.824324
2024-08-05 04:42:35.249 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.906096	valid_1's auc: 0.822413
2024-08-05 04:42:36.818 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[25]	training's auc: 0.849061	valid_1's auc: 0.825301
2024-08-05 04:42:36.915 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2021second\model.params
2024-08-05 04:42:37.873 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022first
2024-08-05 04:42:37.875 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:42:38.017 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:42:38.019 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:42:38.184 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:42:46.663 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.87727	valid_1's auc: 0.830538
2024-08-05 04:43:02.141 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.903892	valid_1's auc: 0.82801
2024-08-05 04:43:06.585 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[67]	training's auc: 0.854664	valid_1's auc: 0.832685
2024-08-05 04:43:06.763 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2022first\model.params
2024-08-05 04:43:07.747 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022second
2024-08-05 04:43:07.748 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:43:07.912 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:43:07.913 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:43:08.090 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:43:17.035 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.875465	valid_1's auc: 0.831728
2024-08-05 04:43:33.706 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.90205	valid_1's auc: 0.829298
2024-08-05 04:43:34.023 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[4]	training's auc: 0.840408	valid_1's auc: 0.833176
2024-08-05 04:43:34.137 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2022second\model.params
2024-08-05 04:43:35.088 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023first
2024-08-05 04:43:35.090 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:43:35.295 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:43:35.297 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:43:35.480 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:43:44.918 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.874735	valid_1's auc: 0.827236
2024-08-05 04:44:02.301 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.900642	valid_1's auc: 0.824765
2024-08-05 04:44:07.800 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[72]	training's auc: 0.853329	valid_1's auc: 0.828611
2024-08-05 04:44:08.021 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2023first\model.params
2024-08-05 04:44:09.018 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023second
2024-08-05 04:44:09.019 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:44:09.179 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:44:09.181 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:44:09.369 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:44:19.292 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.872935	valid_1's auc: 0.825762
2024-08-05 04:44:37.925 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.899011	valid_1's auc: 0.823072
2024-08-05 04:44:40.339 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[29]	training's auc: 0.846909	valid_1's auc: 0.828874
2024-08-05 04:44:40.442 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\params\2023second\model.params
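
In the log above, every period prints `[250]` and `[500]` before stopping, yet the saved best iteration is much earlier (18 for `2019first`, 4 for `2022second`). That is what a patience of 500 rounds produces: training continues for 500 rounds past the best validation score before giving up. Below is a minimal sketch of that rule in plain Python. The function name `early_stopping_round` and the validation curve are illustrative assumptions, not part of LightGBM or of `LightGBMModelManager`.

```python
def early_stopping_round(valid_scores, patience):
    """Return (best_iteration, stop_iteration), both 1-based.

    Applies the rule "stop once the validation score has not improved
    for `patience` rounds" to a metric where higher is better (e.g. AUC).
    """
    best_score = float("-inf")
    best_iter = 0
    for i, score in enumerate(valid_scores, start=1):
        if score > best_score:
            # New best: remember it and keep training.
            best_score, best_iter = score, i
        elif i - best_iter >= patience:
            # No improvement for `patience` rounds: stop here.
            return best_iter, i
    # The curve ended before the patience ran out.
    return best_iter, len(valid_scores)


# Made-up validation curve: improves until round 18, then slowly decays.
scores = [0.80 + 0.002 * i for i in range(1, 19)]
scores += [scores[-1] - 0.00001 * i for i in range(1, 1000)]

best_iter, stop_iter = early_stopping_round(scores, patience=500)
print(best_iter, stop_iter)
```

With a peak at round 18 and a patience of 500, the loop only stops at round 518, which matches the pattern in the log: the `[500]` line is printed and the early-stopping message follows shortly after. Best iterations this small at `learning_rate = 0.01` suggest the validation AUC peaks almost immediately, so it may be worth checking the features and parameters rather than the number of rounds.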
 
Run inference¶
In [24]:
for dataset_dict in dataset_grass.values():
    lgbm_model_manager.load_model(dataset_dict.name)
    lgbm_model_manager.predict(dataset_dict)

bet_mode = BetName.tan
bet_column = lgbm_model_manager.get_bet_column(bet_mode=bet_mode)
pl_column = lgbm_model_manager.get_profit_loss_column(bet_mode=bet_mode)
for dataset_dict in dataset_grass.values():
    lgbm_model_manager.set_bet_column(dataset_dict, bet_mode)
_, dfbetva_grass, dfbette_grass = lgbm_model_manager.merge_dataframe_data(
    dataset_grass, mode=True)
 
2024-08-05 04:44:42.359 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019first
2024-08-05 04:44:42.432 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019first
2024-08-05 04:44:42.939 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019first
2024-08-05 04:44:42.973 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2019first
2024-08-05 04:44:44.782 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019second
2024-08-05 04:44:44.860 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019second
2024-08-05 04:44:45.437 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019second
2024-08-05 04:44:45.473 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2019second
2024-08-05 04:44:47.360 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020first
2024-08-05 04:44:47.430 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020first
2024-08-05 04:44:48.026 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020first
2024-08-05 04:44:48.065 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2020first
2024-08-05 04:44:50.056 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020second
2024-08-05 04:44:50.146 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020second
2024-08-05 04:44:50.951 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020second
2024-08-05 04:44:50.993 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2020second
2024-08-05 04:44:52.983 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021first
2024-08-05 04:44:53.044 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021first
2024-08-05 04:44:53.686 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021first
2024-08-05 04:44:53.732 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2021first
2024-08-05 04:44:55.831 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021second
2024-08-05 04:44:55.904 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021second
2024-08-05 04:44:56.588 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021second
2024-08-05 04:44:56.636 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2021second
2024-08-05 04:44:58.825 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022first
2024-08-05 04:44:58.901 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022first
2024-08-05 04:44:59.676 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022first
2024-08-05 04:44:59.725 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2022first
2024-08-05 04:45:01.905 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022second
2024-08-05 04:45:01.954 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022second
2024-08-05 04:45:02.686 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022second
2024-08-05 04:45:02.739 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2022second
2024-08-05 04:45:05.062 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023first
2024-08-05 04:45:05.141 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023first
2024-08-05 04:45:06.069 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023first
2024-08-05 04:45:06.125 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2023first
2024-08-05 04:45:08.534 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023second
2024-08-05 04:45:08.592 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023second
2024-08-05 04:45:09.452 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023second
2024-08-05 04:45:09.512 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_grass2\analyze\00_predict\2023second
2024-08-05 04:45:11.028 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=bet_columns_map, val={'tan': 'bet_tan'}
2024-08-05 04:45:11.029 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=pl_column_map, val={'tan': 'pl_tan'}
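
The cell above combines `topN = 1` with `BetName.tan` (a win bet) and registers a bet column `bet_tan` and a profit/loss column `pl_tan`. How `LightGBMModelManager` fills those columns is not shown here, so the following is only a rough sketch of one plausible reading: in each race, place a flat stake on the horse with the highest predicted probability and record the profit or loss. The `Row` fields, the 100-yen stake, and the sample values are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple

STAKE = 100  # yen per race; an assumed flat stake


class Row(NamedTuple):
    """One horse in one race (hypothetical schema, not the article's DataFrame)."""
    race_id: str
    number: int
    pred_prob: float
    odds: float
    won: bool


# Made-up prediction rows for two races.
rows = [
    Row("R1", 1, 0.40, 2.5, True),
    Row("R1", 2, 0.35, 3.0, False),
    Row("R1", 3, 0.25, 6.0, False),
    Row("R2", 1, 0.20, 8.0, True),
    Row("R2", 2, 0.55, 1.8, False),
    Row("R2", 3, 0.25, 4.0, False),
]


def top1_win_bets(rows, stake=STAKE):
    """Bet `stake` on the top-ranked horse of each race.

    Returns a list of (race_id, horse number, stake, profit/loss).
    """
    by_race = defaultdict(list)
    for row in rows:
        by_race[row.race_id].append(row)

    results = []
    for race_id, horses in by_race.items():
        pick = max(horses, key=lambda r: r.pred_prob)  # rank 1 by confidence
        payout = stake * pick.odds if pick.won else 0.0
        results.append((race_id, pick.number, stake, payout - stake))
    return results


bets = top1_win_bets(rows)
total_stake = sum(b[2] for b in bets)
total_pl = sum(b[3] for b in bets)
return_rate = (total_stake + total_pl) / total_stake
print(bets)
print(f"return rate: {return_rate:.1%}")
```

In this toy data the pick wins race R1 and loses race R2, so 200 yen staked returns 250 yen. The same aggregation could be applied to the validation and test frames separately to compare the turf-only and dirt-only models on equal terms.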
 

Build the dirt-only model¶

 
Run training¶
In [25]:
lgbm_model_manager_dirt = LightGBMModelManager(
    # Create the model folder for the dirt-only model
    root_dir / "models" / "model_only_dirt2",
    split_year,
    target_year,
    end_year
)
lgbm_model_manager_dirt.set_feature_and_objective_columns(
    feature_columns, objective_column)
lgbm_model_manager_dirt.topN = 1

lgbm_model_manager_dirt.train_all(
    params,
    dataset_dirt,
    stopping_rounds=500,  # Stop early once the validation score has not improved for this many rounds
    val_num=250  # Logging interval, in boosting rounds
)
 
2024-08-05 04:45:11.415 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_type, val=lightGBM
2024-08-05 04:45:11.417 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_id, val=model_only_dirt2
2024-08-05 04:45:11.418 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_dirt2
2024-08-05 04:45:11.419 | INFO     | src.model_manager.base_manager:__init__:43 - make directory. path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2
2024-08-05 04:45:11.424 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_analyze_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze
2024-08-05 04:45:11.425 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_predict_dir, val=e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict
2024-08-05 04:45:11.426 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_column, val=pred_prob
2024-08-05 04:45:11.427 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_rank_column, val=pred_rank
2024-08-05 04:45:11.431 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:77 - Set Feature columns. ['distance', 'number', 'boxNum', 'odds', 'favorite', 'age', 'jweight', 'weight', 'gl', 'race_span', 'raceGrade', 'place_en', 'field_en', 'sex_en', 'condition_en', 'jockeyId_en', 'teacherId_en', 'dist_cat_en', 'horseId_en']
2024-08-05 04:45:11.432 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:79 - Set Objective columns. label_in1
2024-08-05 04:45:11.433 | INFO     | src.model_manager.lgbm_manager:save_root_model_info:281 - Save model params and dataset columns
2024-08-05 04:45:11.440 | INFO     | src.model_manager.lgbm_manager:train_all:262 - Training Start!
2024-08-05 04:45:11.441 | INFO     | src.model_manager.lgbm_manager:train_all:263 - ==================  train params  ========================
2024-08-05 04:45:11.442 | INFO     | src.model_manager.lgbm_manager:train_all:266 - boosting_type             =     gbdt
2024-08-05 04:45:11.443 | INFO     | src.model_manager.lgbm_manager:train_all:266 - objective                 =     binary
2024-08-05 04:45:11.444 | INFO     | src.model_manager.lgbm_manager:train_all:266 - metric                    =     auc
2024-08-05 04:45:11.445 | INFO     | src.model_manager.lgbm_manager:train_all:266 - verbose                   =     0
2024-08-05 04:45:11.446 | INFO     | src.model_manager.lgbm_manager:train_all:266 - seed                      =     77777
2024-08-05 04:45:11.447 | INFO     | src.model_manager.lgbm_manager:train_all:266 - learning_rate             =     0.01
2024-08-05 04:45:11.449 | INFO     | src.model_manager.lgbm_manager:train_all:266 - n_estimators              =     10000
2024-08-05 04:45:11.450 | INFO     | src.model_manager.lgbm_manager:train_all:267 - ==========================================================
2024-08-05 04:45:11.451 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019first
2024-08-05 04:45:11.452 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:45:11.547 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:45:11.549 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:45:11.674 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:45:16.833 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.892287	valid_1's auc: 0.827398
2024-08-05 04:45:26.096 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.924317	valid_1's auc: 0.824905
2024-08-05 04:45:26.932 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[22]	training's auc: 0.853636	valid_1's auc: 0.8305
2024-08-05 04:45:27.021 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2019first\model.params
2024-08-05 04:45:27.977 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2019second
2024-08-05 04:45:27.978 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:45:28.121 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:45:28.122 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:45:28.258 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:45:34.009 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.889208	valid_1's auc: 0.820366
2024-08-05 04:45:44.360 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.920663	valid_1's auc: 0.818507
2024-08-05 04:45:45.422 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[25]	training's auc: 0.854107	valid_1's auc: 0.823117
2024-08-05 04:45:45.546 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2019second\model.params
2024-08-05 04:45:46.523 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020first
2024-08-05 04:45:46.525 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:45:46.691 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:45:46.693 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:45:46.834 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:45:53.226 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.884969	valid_1's auc: 0.825236
2024-08-05 04:46:04.998 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.915289	valid_1's auc: 0.822479
2024-08-05 04:46:06.446 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[30]	training's auc: 0.852515	valid_1's auc: 0.827618
2024-08-05 04:46:06.542 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2020first\model.params
2024-08-05 04:46:07.500 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2020second
2024-08-05 04:46:07.501 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:46:07.629 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:46:07.631 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:46:07.778 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:46:14.617 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.883667	valid_1's auc: 0.832062
2024-08-05 04:46:27.364 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.912794	valid_1's auc: 0.830017
2024-08-05 04:46:29.057 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[31]	training's auc: 0.852393	valid_1's auc: 0.834074
2024-08-05 04:46:29.176 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2020second\model.params
2024-08-05 04:46:30.138 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021first
2024-08-05 04:46:30.139 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:46:30.321 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:46:30.323 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:46:30.476 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:46:37.870 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.881116	valid_1's auc: 0.814129
2024-08-05 04:46:51.973 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.909153	valid_1's auc: 0.812765
2024-08-05 04:46:59.715 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[132]	training's auc: 0.865583	valid_1's auc: 0.814922
2024-08-05 04:47:00.035 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2021first\model.params
2024-08-05 04:47:01.105 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2021second
2024-08-05 04:47:01.107 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:47:01.239 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:47:01.241 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:47:01.402 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:47:09.257 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.879119	valid_1's auc: 0.840663
2024-08-05 04:47:24.520 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.906463	valid_1's auc: 0.837274
2024-08-05 04:47:27.283 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[43]	training's auc: 0.852409	valid_1's auc: 0.842359
2024-08-05 04:47:27.383 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2021second\model.params
2024-08-05 04:47:28.353 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022first
2024-08-05 04:47:28.354 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:47:28.524 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:47:28.525 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:47:28.693 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:47:37.070 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.876844	valid_1's auc: 0.816646
2024-08-05 04:47:53.754 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.903971	valid_1's auc: 0.814472
2024-08-05 04:47:56.347 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[35]	training's auc: 0.850723	valid_1's auc: 0.8197
2024-08-05 04:47:56.482 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2022first\model.params
2024-08-05 04:47:57.450 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2022second
2024-08-05 04:47:57.451 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:47:57.624 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:47:57.625 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:47:57.803 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:48:06.762 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.875403	valid_1's auc: 0.848327
2024-08-05 04:48:24.699 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.902316	valid_1's auc: 0.846846
2024-08-05 04:48:36.450 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[159]	training's auc: 0.864476	valid_1's auc: 0.848732
2024-08-05 04:48:36.901 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2022second\model.params
2024-08-05 04:48:38.005 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023first
2024-08-05 04:48:38.006 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:48:38.182 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:48:38.184 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:48:38.367 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:48:47.998 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.874736	valid_1's auc: 0.849187
2024-08-05 04:49:07.635 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.900896	valid_1's auc: 0.847178
2024-08-05 04:49:17.504 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[115]	training's auc: 0.859439	valid_1's auc: 0.849603
2024-08-05 04:49:17.794 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2023first\model.params
2024-08-05 04:49:18.839 | INFO     | src.model_manager.lgbm_manager:train_all:271 - Start training model. model name: 2023second
2024-08-05 04:49:18.841 | WARNING  | lightgbm.basic:_log_warning:195 - Found `n_estimators` in params. Will use it instead of argument
2024-08-05 04:49:19.033 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] Categorical features with more bins than the configured maximum bin number found.
2024-08-05 04:49:19.035 | INFO     | lightgbm.basic:_log_native:200 - [LightGBM] [Warning] For categorical features, max_bin and max_bin_by_feature may be ignored with a large number of categories.
2024-08-05 04:49:19.220 | INFO     | lightgbm.basic:_log_info:191 - Training until validation scores don't improve for 500 rounds
2024-08-05 04:49:29.269 | INFO     | lightgbm.basic:_log_info:191 - [250]	training's auc: 0.874521	valid_1's auc: 0.842834
2024-08-05 04:49:49.601 | INFO     | lightgbm.basic:_log_info:191 - [500]	training's auc: 0.900463	valid_1's auc: 0.840593
2024-08-05 04:49:50.647 | INFO     | lightgbm.basic:_log_info:191 - Early stopping, best iteration is:
[11]	training's auc: 0.845526	valid_1's auc: 0.844109
2024-08-05 04:49:50.770 | INFO     | src.model_manager.lgbm_manager:save_model:222 - Saving model... model path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\params\2023second\model.params
 
Running inference¶
In [26]:
for dataset_dict in dataset_dirt.values():
    lgbm_model_manager_dirt.load_model(dataset_dict.name)
    lgbm_model_manager_dirt.predict(dataset_dict)

bet_mode = BetName.tan
bet_column = lgbm_model_manager_dirt.get_bet_column(bet_mode=bet_mode)
pl_column = lgbm_model_manager_dirt.get_profit_loss_column(bet_mode=bet_mode)
for dataset_dict in dataset_dirt.values():
    lgbm_model_manager_dirt.set_bet_column(dataset_dict, bet_mode)
_, dfbetva_dirt, dfbette_dirt = lgbm_model_manager_dirt.merge_dataframe_data(
    dataset_dirt, mode=True)
 
2024-08-05 04:49:52.687 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019first
2024-08-05 04:49:52.756 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019first
2024-08-05 04:49:53.605 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019first
2024-08-05 04:49:53.667 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2019first
2024-08-05 04:49:56.429 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2019second
2024-08-05 04:49:56.510 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2019second
2024-08-05 04:49:57.515 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2019second
2024-08-05 04:49:57.578 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2019second
2024-08-05 04:50:00.346 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020first
2024-08-05 04:50:00.420 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020first
2024-08-05 04:50:01.517 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020first
2024-08-05 04:50:01.583 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2020first
2024-08-05 04:50:04.372 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2020second
2024-08-05 04:50:04.424 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2020second
2024-08-05 04:50:05.614 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2020second
2024-08-05 04:50:05.691 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2020second
2024-08-05 04:50:08.954 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021first
2024-08-05 04:50:09.038 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021first
2024-08-05 04:50:10.874 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021first
2024-08-05 04:50:10.946 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2021first
2024-08-05 04:50:14.144 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2021second
2024-08-05 04:50:14.199 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2021second
2024-08-05 04:50:15.646 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2021second
2024-08-05 04:50:15.723 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2021second
2024-08-05 04:50:18.976 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022first
2024-08-05 04:50:19.027 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022first
2024-08-05 04:50:20.616 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022first
2024-08-05 04:50:20.685 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2022first
2024-08-05 04:50:24.192 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2022second
2024-08-05 04:50:24.285 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2022second
2024-08-05 04:50:26.743 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2022second
2024-08-05 04:50:26.826 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2022second
2024-08-05 04:50:30.574 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023first
2024-08-05 04:50:30.656 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023first
2024-08-05 04:50:32.865 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023first
2024-08-05 04:50:32.940 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2023first
2024-08-05 04:50:36.891 | INFO     | src.model_manager.lgbm_manager:load_model:201 - Loading model... model name: 2023second
2024-08-05 04:50:36.961 | INFO     | src.model_manager.lgbm_manager:load_model:203 - model activate! model_name: 2023second
2024-08-05 04:50:38.869 | INFO     | src.model_manager.base_manager:set_predict_dataframe:379 - Set the infered DataFrame into the dataset. model_name: 2023second
2024-08-05 04:50:38.956 | INFO     | src.model_manager.base_manager:save_predict_result:406 - Save predict result. save path: e:\dev_um_ai\dev-um-ai\models\model_only_dirt2\analyze\00_predict\2023second
2024-08-05 04:50:42.097 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=bet_columns_map, val={'tan': 'bet_tan'}
2024-08-05 04:50:42.098 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=pl_column_map, val={'tan': 'pl_tan'}
 

Exporting the model info¶

 

This is the laborious part: the turf and dirt results have to be merged so that the web app can import them.

In [27]:
lgbm_model_manager_merge = LightGBMModelManager(
    # Create the model folder for the merged results
    root_dir / "models" / "model_field_div_part2",
    split_year,
    target_year,
    end_year
)
lgbm_model_manager_merge.set_feature_and_objective_columns(
    feature_columns, objective_column)
lgbm_model_manager_merge.topN = 1
lgbm_model_manager_merge.expt_info_map.bet_columns_map = {"tan": "bet_tan"}
lgbm_model_manager_merge.expt_info_map.pl_column_map = {"tan": "pl_tan"}
lgbm_model_manager_merge.expt_info_map.confidence_column = "pred_prob"
lgbm_model_manager_merge.expt_info_map.confidence_rank_column = "pred_rank"
 
2024-08-05 04:50:42.721 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_type, val=lightGBM
2024-08-05 04:50:42.722 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_id, val=model_grass_dirt_part2
2024-08-05 04:50:42.724 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_dir, val=e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part2
2024-08-05 04:50:42.725 | INFO     | src.model_manager.base_manager:__init__:43 - make directory. path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part2
2024-08-05 04:50:42.728 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_analyze_dir, val=e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part2\analyze
2024-08-05 04:50:42.729 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=model_predict_dir, val=e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part2\analyze\00_predict
2024-08-05 04:50:42.731 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_column, val=pred_prob
2024-08-05 04:50:42.732 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=confidence_rank_column, val=pred_rank
2024-08-05 04:50:42.734 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:77 - Set Feature columns. ['distance', 'number', 'boxNum', 'odds', 'favorite', 'age', 'jweight', 'weight', 'gl', 'race_span', 'raceGrade', 'place_en', 'field_en', 'sex_en', 'condition_en', 'jockeyId_en', 'teacherId_en', 'dist_cat_en', 'horseId_en']
2024-08-05 04:50:42.735 | INFO     | src.data_manager.dataset_tools:set_feature_and_objective_columns:79 - Set Objective columns. label_in1
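The settings above name `pred_prob` as the confidence column and `pred_rank` as its rank column, with `topN = 1`. As a rough illustration of how such columns could relate to a bet flag, here is a minimal pandas sketch. It is an assumption for illustration only, not the implementation inside `LightGBMModelManager`: the helper `add_rank_and_bet` and the toy data are hypothetical, and only the column names `raceId`, `pred_prob`, `pred_rank` and `bet_tan` are taken from this notebook.

```python
import pandas as pd


def add_rank_and_bet(df: pd.DataFrame, top_n: int = 1) -> pd.DataFrame:
    """Rank horses within each race by predicted probability and flag the top N.

    Hypothetical helper: the real manager class may compute these differently.
    """
    out = df.copy()
    # Rank 1 = highest predicted probability within the same raceId.
    out["pred_rank"] = (
        out.groupby("raceId")["pred_prob"]
        .rank(method="first", ascending=False)
        .astype(int)
    )
    # Assumed bet flag: 1 for the top-N horses of each race, otherwise 0.
    out["bet_tan"] = (out["pred_rank"] <= top_n).astype(int)
    return out


# Toy predictions for two races (made-up values).
toy = pd.DataFrame({
    "raceId": ["r1", "r1", "r1", "r2", "r2"],
    "number": [1, 2, 3, 1, 2],
    "pred_prob": [0.2, 0.5, 0.3, 0.6, 0.4],
})
ranked = add_rank_and_bet(toy, top_n=1)
print(ranked)
```

With `top_n=1`, exactly one horse per race is flagged, which matches the idea of a single win bet per race.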
 

Calculating profit and loss

In [28]:
dfbetva, dfbette = \
    pd.concat(
        [dfbetva_dirt, dfbetva_grass]
    ).sort_values(["raceDate", "raceId"], ignore_index=True), \
    pd.concat(
        [dfbette_dirt, dfbette_grass]
    ).sort_values(["raceDate", "raceId"], ignore_index=True)

dfbetva, dfbette = lgbm_model_manager_merge.generate_profit_loss(
    dfbetva, dfbette, bet_mode)
 
2024-08-05 04:50:43.316 | INFO     | src.model_manager.base_manager:__save_profit_loss:646 - Save profit loss data. save_path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part2\analyze\tan\profit_loss
2024-08-05 04:50:43.318 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=profit_loss_dir, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part2\\analyze\\tan\\profit_loss'}
 

Calculating the basic statistics

In [29]:
dataset_merge = {}
# Pair each turf dataset with the dirt dataset of the same period
# (both dicts are assumed to share the same key order).
for (key, dset_g), dset_d in zip(dataset_grass.items(), dataset_dirt.values()):
    dset_m = LightGBMDataset(key, None, None, None)
    # Concatenate the turf and dirt predictions, then restore race order.
    dset_m.pred_train = pd.concat(
        [dset_g.pred_train, dset_d.pred_train]).sort_values(
            ["raceDate", "raceId"], ignore_index=True)
    dset_m.pred_test = pd.concat(
        [dset_g.pred_test, dset_d.pred_test]).sort_values(
            ["raceDate", "raceId"], ignore_index=True)
    dset_m.pred_valid = pd.concat(
        [dset_g.pred_valid, dset_d.pred_valid]).sort_values(
            ["raceDate", "raceId"], ignore_index=True)
    dataset_merge[key] = dset_m


lgbm_model_manager_merge.model_name_list = lgbm_model_manager.model_name_list
lgbm_model_manager_merge.basic_analyze(dataset_merge)
 
2024-08-05 04:50:45.275 | INFO     | src.model_manager.base_manager:basic_analyze:220 - Start basic analyze.
2024-08-05 04:50:45.716 | INFO     | src.model_manager.base_manager:basic_analyze:256 - Saving Return And Hit Rate Summary.
2024-08-05 04:50:45.727 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=return_hit_rate_file, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part2\\analyze\\tan\\hit_and_return_rate.csv'}
2024-08-05 04:50:45.728 | INFO     | src.model_manager.base_manager:basic_analyze:259 - Saving Favorite Bet Num Summary.
2024-08-05 04:50:45.756 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=fav_bet_num_dir, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part2\\analyze\\tan\\fav_bet_num'}
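The merge cell above pairs the turf and dirt datasets with `zip`, which relies on both dicts being in the same order. A minimal sketch of pairing them by model name instead is shown below. The helper `merge_by_model_name` and the toy frames are hypothetical, and plain DataFrames stand in for the `LightGBMDataset` objects; only the sort keys `raceDate` and `raceId` come from this notebook.

```python
import pandas as pd


def merge_by_model_name(preds_a: dict, preds_b: dict) -> dict:
    """Concatenate two {model_name: DataFrame} mappings key by key.

    Hypothetical helper: raises if the two mappings do not cover the
    same model names, instead of silently pairing mismatched periods.
    """
    if preds_a.keys() != preds_b.keys():
        raise KeyError("model names differ between the two mappings")
    return {
        name: pd.concat([preds_a[name], preds_b[name]]).sort_values(
            ["raceDate", "raceId"], ignore_index=True)
        for name in preds_a
    }


# Toy per-surface predictions for a single model name (made-up values).
grass = {"2019first": pd.DataFrame({
    "raceDate": ["2019-01-06", "2019-01-05"],
    "raceId": ["g2", "g1"],
    "pred_prob": [0.4, 0.7],
})}
dirt = {"2019first": pd.DataFrame({
    "raceDate": ["2019-01-05"],
    "raceId": ["d1"],
    "pred_prob": [0.6],
})}

merged = merge_by_model_name(grass, dirt)
print(merged["2019first"])
```

Looking up each partner by key makes a mismatch fail loudly, which is easier to debug than a merged frame that quietly contains the wrong rows.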
 

Creating the odds graph

In [30]:
dft_g, dfv_g, dfte_g = lgbm_model_manager.merge_dataframe_data(
    dataset_grass,
    mode=True
)

dft_d, dfv_d, dfte_d = lgbm_model_manager_dirt.merge_dataframe_data(
    dataset_dirt,
    mode=True
)

dft_m, dfv_m, dfte_m = \
    pd.concat([dft_g, dft_d]), pd.concat(
        [dfv_g, dfv_d]), pd.concat([dfte_g, dfte_d])

summary_dict = lgbm_model_manager_merge.gegnerate_odds_graph(
    dft_m, dfv_m, dfte_m, bet_mode)
 
2024-08-05 04:50:47.856 | INFO     | src.model_manager.base_manager:__save_odds_graph:514 - Save Odds Graph. save_path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part2\analyze\tan\odds_graph
2024-08-05 04:50:47.856 | INFO     | src.model_manager.base_manager:set_keyvalue_to_export_mapping:139 - Set Export info. key=odds_graph_file, val={'tan': 'e:\\dev_um_ai\\dev-um-ai\\models\\model_grass_dirt_part2\\analyze\\tan\\odds_graph'}
 

Exporting the model info

In [31]:
lgbm_model_manager_merge.export_model_info()
 
2024-08-05 04:50:47.880 | INFO     | src.model_manager.base_manager:export_model_info:848 - Export Model info json. export path: e:\dev_um_ai\dev-um-ai\models\model_grass_dirt_part2\model_info.json
 

4-5-3. Checking performance (launching the web app)¶

In [32]:
! python ../app_keiba/manage.py makemigrations
! python ../app_keiba/manage.py migrate 
! echo server launch OK
# ! python ../app_keiba/manage.py runserver 12345
 
No changes detected
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, model_analyzer, sessions
Running migrations:
  No migrations to apply.
server launch OK
 

Once "server launch OK" is displayed, click the link below to open the web app.

http://localhost:12345/index.html

To stop the server, press the cell's interrupt button.

 
  Model ID               Support-rate OGS  Return-rate OGS  AonB OGS
1 model_field_div_part2  0.56458           -7.49332         0.15654
2 model_field_div_part1  0.38790           -7.67658         -0.02672
3 first_model (baseline) 0.41924           -7.64492
 

4-6. Conclusion¶

 

Conclusion: the models should probably be split between turf and dirt

 

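Judging by AonB OGS, the Part 2 model, i.e. the one trained completely separately for turf and dirt, performs best.
It is also notable that the Part 1 model, i.e. the one where only the validation and test data were split into turf and dirt, performs worse than the first model.
This suggests that turf (or dirt) race information in the training data is unnecessary when predicting dirt (or turf) races.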

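That said, previous-race data has not been taken into account yet, so it feels premature to draw a conclusion.
Frankly, training separate turf and dirt models would require brutal changes to the existing source, so I want to make this decision carefully.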

Still, the results show an improvement of 0.15654 in AonB OGS over the first model.
That seems hard to ignore.

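Therefore, I will eventually rework the source so that the models are trained separately for turf and dirt.
Model performance might improve even further with finer splits, such as by racecourse or by distance category.
The timing of that rework, which should support general-purpose model splitting and not just turf versus dirt, will be decided again after building the model that takes pedigree and previous-race data into account.
At that point, I will improve this Notebook a little so that it can also aggregate results by track surface, by racecourse, by distance category, by surface × racecourse, by surface × racecourse × distance category, and other combinations.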
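As a starting point for that kind of aggregation, here is a minimal pandas sketch that summarizes bets for every combination of split columns. It rests on several assumptions: `field_en`, `place_en` and `dist_cat_en` (feature names from this notebook) are taken to encode surface, racecourse and distance category; the columns `hit` and `payout`, the fixed 100-yen stake, and the helper `summarize_by` are hypothetical and not part of the author's source.

```python
from itertools import combinations

import pandas as pd

STAKE = 100  # assumed stake per bet, in yen


def summarize_by(df: pd.DataFrame, group_cols) -> pd.DataFrame:
    """Bet count, hit rate and return rate for one grouping (hypothetical helper)."""
    bets = df[df["bet_tan"] == 1]  # keep only rows that were actually bet on
    summary = bets.groupby(list(group_cols)).agg(
        n_bets=("bet_tan", "count"),
        hit_rate=("hit", "mean"),
        total_payout=("payout", "sum"),
    )
    # Return rate = total payout / total stake.
    summary["return_rate"] = summary["total_payout"] / (STAKE * summary["n_bets"])
    return summary.reset_index()


# Toy betting results (made-up values).
toy = pd.DataFrame({
    "field_en": [0, 0, 1, 1],
    "place_en": [5, 5, 5, 6],
    "dist_cat_en": [1, 2, 1, 1],
    "bet_tan": [1, 1, 1, 0],
    "hit": [1, 0, 1, 0],
    "payout": [350, 0, 120, 0],
})

split_cols = ["field_en", "place_en", "dist_cat_en"]
# One summary table per non-empty combination of the split columns.
summaries = {
    combo: summarize_by(toy, combo)
    for r in range(1, len(split_cols) + 1)
    for combo in combinations(split_cols, r)
}
print(summaries[("field_en",)])
```

Generating the combinations programmatically keeps the notebook the same size whether it compares three groupings or seven, and the same list of columns could later drive the model split itself.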
