|
Post by Ian Noble on Jan 8, 2020 21:40:41 GMT
I'm starting to plan how my own sim engine might work, and I'm looking for input from the community on what it might look like. For instance:
- A player's IRL 3PT% should determine how well they shoot the 3 in the sim engine, with a standard deviation perhaps derived from how much the player's shooting has varied over their career. Same for FG% and FT%.
- Injuries could be handled in two different ways: (1) they happen when a player's IRL injury happens or (2) a player's injury probability depends on how frequently they've been injured IRL in the past, but an IRL injury does not mean a player is also out.
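A minimal sketch of the shooting idea above, assuming a normal distribution around the career percentage. The function names and the sample numbers (0.38 career 3PT%, 0.04 standard deviation, 8 attempts) are made up for illustration and are not real player data.

```python
import random

def sample_game_pct(career_pct: float, career_sd: float, rng: random.Random) -> float:
    """Draw one game's shooting percentage around the career mean (assumed normal)."""
    pct = rng.gauss(career_pct, career_sd)
    return min(1.0, max(0.0, pct))  # clamp so the result is a valid probability

def simulate_attempts(attempts: int, pct: float, rng: random.Random) -> int:
    """Count makes from independent attempts at the given percentage."""
    return sum(rng.random() < pct for _ in range(attempts))

rng = random.Random(42)
# Hypothetical player: 38% career 3PT shooter with a 4-point standard deviation.
tonight = sample_game_pct(career_pct=0.38, career_sd=0.04, rng=rng)
makes = simulate_attempts(attempts=8, pct=tonight, rng=rng)
print(f"3PT tonight: {makes}/8 at a sampled {tonight:.1%}")
```

The same two functions could be reused for FG% and FT% by passing different career numbers.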
Frankly I should probably look for a few open source examples I can start from and work from there, but nevertheless any input here is helpful. I'm especially interested in (1) how you guys think advanced stats might provide a guide, since I don't spend that much time poring over advanced stats personally, though the sim engine could adapt on a daily basis to changes by polling live APIs, and (2) how Usage should be determined on a team full of stars like D5 Charlotte (LeBron, PG, Westbrook, Middleton), i.e. a guy like LeBron should always have high USG no matter what team he is on, but a guy like Khris Middleton should have his USG severely impacted by playing on a team like D5 CHA. Lastly, please don't expect this to happen any time soon; it could take me years depending on how much time I get.
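One possible way to handle the Usage question, sketched under heavy assumptions: each player gets a base usage plus a hypothetical "stickiness" rating (0 to just under 1) that says how much of their usage they hold onto, and the lineup's excess over 100% is taken mostly from the least sticky players. Every number below is a placeholder, not a real stat.

```python
def adjust_usage(lineup: dict[str, tuple[float, float]]) -> dict[str, float]:
    """lineup maps name -> (base_usage, stickiness); returns usages that sum to 1.0.

    Assumes stickiness < 1 for at least one player and an excess small
    enough that nobody is pushed below zero.
    """
    excess = sum(base for base, _ in lineup.values()) - 1.0
    # Players with high base usage and low stickiness absorb most of the excess.
    weights = {name: base * (1.0 - stick) for name, (base, stick) in lineup.items()}
    total_weight = sum(weights.values())
    return {
        name: base - excess * weights[name] / total_weight
        for name, (base, _) in lineup.items()
    }

# Placeholder ratings for a star-heavy lineup (illustrative only).
lineup = {
    "LeBron": (0.31, 0.9),
    "Westbrook": (0.30, 0.6),
    "PG": (0.29, 0.5),
    "Middleton": (0.25, 0.2),
    "Fifth starter": (0.15, 0.2),
}
adjusted = adjust_usage(lineup)
for name, usage in adjusted.items():
    print(f"{name}: {lineup[name][0]:.1%} -> {usage:.1%}")
```

With these placeholder inputs the high-stickiness player gives up very little usage while the low-stickiness players give up the most, which is the behaviour described above.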
|
|
|
Post by George Gervin on Jan 8, 2020 22:08:31 GMT
Personally, on the advanced stats front, as someone who pores over them... they can be as misleading as they are insightful. Some, such as BPM, are really simple and easily translatable as far as impact. Win Shares is something else that would inform where a player should stand overall. However, especially when it comes to defensive stats, advanced measures aren't everything. For example, if you look at the top 10 this year by advanced defensive metrics, particularly defensive rating, there are guys like DeAndre Jordan, Brook Lopez, and Donte DiVincenzo in that group. Would I say those guys should be rated as top 10 defensive players in D5? Absolutely not. It goes to show that advanced stats can be helpful inputs, but they shouldn't be worshipped at the altar when implementing the critical elements of a sim engine.
On the injury front, I think the injury probability approach is closer to how 2K and most sports video games calculate injury risk. It may be easier than trying to do it in real time, with the exception of season-ending injuries in real life.
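A rough sketch of the probability approach with the season-ending exception, assuming a per-game injury chance estimated from a player's history and pulled toward a league-wide baseline. The baseline rate (0.02), the 82-game prior, and the function names are all assumptions for illustration.

```python
import random

def injury_probability(past_injuries: int, games_played: int,
                       league_rate: float = 0.02, prior_games: int = 82) -> float:
    """Per-game injury chance, blended with an assumed league baseline.

    Players with little history stay near league_rate; a long injury
    history moves the estimate toward the player's own rate.
    """
    return (past_injuries + league_rate * prior_games) / (games_played + prior_games)

def is_injured_tonight(p_injury: float, out_for_season_irl: bool, rng: random.Random) -> bool:
    """Season-ending IRL injuries always carry over; otherwise roll the dice."""
    if out_for_season_irl:
        return True
    return rng.random() < p_injury

rng = random.Random(7)
p = injury_probability(past_injuries=12, games_played=400)  # hypothetical history
print(f"per-game injury chance: {p:.2%}")
print("injured tonight:", is_injured_tonight(p, out_for_season_irl=False, rng=rng))
```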
|
|
|
Post by Hanamichi Sakuragi on Jan 8, 2020 22:58:57 GMT
We are willing to wait Ian. We trust you.
|
|
|
Post by Alex English on Jan 9, 2020 5:46:08 GMT
I've thought a bit about this while bored at work before, even playing around with Excel to make a very simple simulation. I think the way NBA Live works is dumb because the stats don't actually mean anything. The result is determined and then statistics are attributed afterward. It's a top-down approach, which I think is rubbish. A truly good sim engine should be bottom-up by simulating each possession. The final result would be organic since it'd just be the product of each individual possession added together. Stats would mean something and individual player attributes would be very important.
The way I think it could work would be to use a random number generator and a flow chart that walks through each possession step-by-step.
Just taking a rough average from last season, here's how every possession ends up:
- 72% - Field goal attempt
  - 64% 2-point attempt (52% makes), 36% 3-point attempt (35% makes)
  - Rebounds: 77% DReb, 23% OReb
- 11% - Turnover
- 9% - Shooting foul (77% makes)
- 8% - Non-shooting foul
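As a quick sketch, the top-level split above can be drawn with a single weighted random choice. The outcome labels are just names chosen here; the weights are the rough averages quoted in the post.

```python
import random
from collections import Counter

OUTCOMES = ["field_goal_attempt", "turnover", "shooting_foul", "non_shooting_foul"]
WEIGHTS = [0.72, 0.11, 0.09, 0.08]  # rough league averages from the post

def sample_outcome(rng: random.Random) -> str:
    """Pick how one possession ends, using the weights above."""
    return rng.choices(OUTCOMES, weights=WEIGHTS, k=1)[0]

rng = random.Random(0)
counts = Counter(sample_outcome(rng) for _ in range(10_000))
for outcome in OUTCOMES:
    print(f"{outcome}: {counts[outcome] / 10_000:.1%}")
```

Over many draws the observed shares should land close to the input weights, which is a handy sanity check before adding more steps.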
So the most basic form of this idea would be something like this:
- Run RNG for jump ball (50/50) > Team A wins
- Run RNG based on above probabilities (72%/11%/9%/8%) > Field goal attempt
  - Run RNG for attempt (64%/36%) > 2-point attempt
  - Run RNG for success rate (52%/48%) > Misses attempt
  - Run RNG for rebound (77%/23%) > Defensive rebound
  - End of possession > Team B gets ball
- Run RNG based on above probabilities (72%/11%/9%/8%) > Turnover
  - End of possession > Team A gets ball
- Run RNG based on above probabilities (72%/11%/9%/8%) > Shooting foul
  - Run RNG for first free throw (77%/23%) > Makes free throw (+1 score for Team A)
  - Run RNG for second free throw (77%/23%) > Makes free throw (+1 score for Team A)
  - End of possession > Team B gets ball
- Run RNG based on above probabilities (72%/11%/9%/8%) > Non-shooting foul
  - End of possession > Team B retains ball
- Run RNG based on above probabilities (72%/11%/9%/8%) > Field goal attempt
  - Run RNG for attempt (64%/36%) > 3-point attempt
  - Run RNG for success rate (35%/65%) > Makes attempt (+3 score for Team B)
  - End of possession > Team A gets ball
- Run RNG based on above probabilities (72%/11%/9%/8%) > Field goal attempt
  - Run RNG for attempt (64%/36%) > 2-point attempt
  - Run RNG for success rate (52%/48%) > Misses attempt
  - Run RNG for rebound (77%/23%) > Offensive rebound
  - End of possession > Team A retains ball
- Run RNG based on above probabilities (72%/11%/9%/8%) > Shooting foul
  - Run RNG for first free throw (77%/23%) > Makes free throw (+1 score for Team A)
  - Run RNG for second free throw (77%/23%) > Misses free throw
  - Run RNG for rebound (77%/23%) > Defensive rebound
  - End of possession > Team B gets ball
After 7 possessions the game is tied 3-3. Do this 100 more times and you have a full game.
That's about as simple as you could make it. It gets a million times more complicated once you add in a step for which player uses the possession (based on player usage ratings), then change the default possession outcome probabilities based on player attributes, then take into account if the possession is after a defensive rebound (transition attempt = higher probability of scoring), then take into account the length of the possession (longer possessions = lower probability of scoring), then take into account opponent's defensive attributes, and so on, and so on...
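A small sketch of the first two of those extra steps: picking who uses the possession by usage rating, then shifting the make probability for context. The `Player` fields, the 0-to-1 defence rating scale, and the size of each adjustment (+5 points in transition, up to 5 points either way for defence) are placeholder assumptions, not measured values.

```python
import random
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    usage: float       # share of team possessions this player uses
    two_pt_pct: float  # base make probability on 2-point attempts

def pick_shooter(lineup: list[Player], rng: random.Random) -> Player:
    """Choose who uses the possession, weighted by usage rating."""
    return rng.choices(lineup, weights=[p.usage for p in lineup], k=1)[0]

def make_probability(shooter: Player, defence_rating: float, in_transition: bool) -> float:
    """Base percentage shifted by context; 0.5 is an average defence."""
    p = shooter.two_pt_pct
    if in_transition:
        p += 0.05                              # assumed transition bonus
    p -= 0.10 * (defence_rating - 0.5)         # assumed defensive effect
    return min(0.95, max(0.05, p))             # keep within sensible bounds

# Hypothetical lineup for illustration.
lineup = [Player("Star", 0.32, 0.55), Player("Wing", 0.22, 0.50),
          Player("Guard", 0.20, 0.48), Player("Forward", 0.14, 0.52),
          Player("Center", 0.12, 0.60)]
rng = random.Random(11)
shooter = pick_shooter(lineup, rng)
p = make_probability(shooter, defence_rating=0.7, in_transition=True)
print(f"{shooter.name} shoots at {p:.1%}")
```

Keeping each adjustment in its own small function is one way to avoid a deep chain of nested conditions as more factors get added.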
I don't have too much coding experience, but I think the end result would be like a thousand nested if statements. The sim engine would just work its way through the flow chart one possession at a time.
|
|
|
Post by Ian Noble on May 15, 2020 11:45:43 GMT
It's taken me ages to get back to this, but Alex English - that is exactly what I had in mind, and I like the terminology: simming from the bottom upwards. Personally I would hope to take into account individual possession matchups as you described here:
"It gets a million times more complicated once you add in a step for which player uses the possession (based on player usage ratings), then change the default possession outcome probabilities based on player attributes, then take into account if the possession is after a defensive rebound (transition attempt = higher probability of scoring), then take into account the length of the possession (longer possessions = lower probability of scoring), then take into account opponent's defensive attributes, and so on, and so on... I don't have too much coding experience, but I think the end result would be like a thousand nested if statements. The sim engine would just work its way through the flow chart one possession at a time."
|
|
|
Post by Ian Noble on May 15, 2020 11:47:54 GMT
For me the development of all of this begins with scouring basketball-reference.com's API so we have some understanding of universal variables to use, and then expanding it beyond just a game-by-game sim to include injuries and possibly even team chemistry.
|
|
Billy King
Former Jazz and Knicks GM
Rookie
Posts: 247
Oct 30, 2023 14:13:22 GMT
|
Post by Billy King on May 15, 2020 12:30:19 GMT
I just now saw this. I think the biggest thing that other engines lack is separating volume from efficiency. You need a stat that can just be "how often does this player do X", maybe a rating for AST%, BLK%, USG%, etc., but then ALSO separate ratings for how good they are.
This little change would make your sim engine the best one, even if everything else stayed the same.
Good luck Ian!
|
|
|
Post by Ian Noble on May 15, 2020 12:57:16 GMT
Yeah this is a great distinction to make!
|
|
|
Post by Billy King on May 15, 2020 15:21:21 GMT
Re-reading my post, it doesn't make sense for AST% etc., because AST% is directly correlated to volume; the same goes for rebounds or blocks.
It's specifically efficiency and shooting volume that need to be separated.
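A minimal sketch of keeping shooting volume and efficiency as separate ratings, assuming a per-player profile with a 3-point attempt rate (volume) and independent make percentages (efficiency). The class, field names, and the two sample profiles are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class ShootingProfile:
    three_rate: float  # volume: share of this player's shots that are 3s
    three_pct: float   # efficiency: make probability on 3s
    two_pct: float     # efficiency: make probability on 2s

def take_shot(profile: ShootingProfile, rng: random.Random) -> int:
    """Volume decides the shot type, efficiency decides whether it goes in."""
    is_three = rng.random() < profile.three_rate
    made = rng.random() < (profile.three_pct if is_three else profile.two_pct)
    if not made:
        return 0
    return 3 if is_three else 2

def expected_points(profile: ShootingProfile) -> float:
    """Average points per shot implied by the two kinds of rating."""
    return (profile.three_rate * profile.three_pct * 3
            + (1.0 - profile.three_rate) * profile.two_pct * 2)

# Two made-up players: one fires often from deep, one rarely but accurately.
chucker = ShootingProfile(three_rate=0.60, three_pct=0.31, two_pct=0.45)
sniper = ShootingProfile(three_rate=0.15, three_pct=0.42, two_pct=0.50)
rng = random.Random(5)
print("chucker:", expected_points(chucker), "sample shot:", take_shot(chucker, rng))
print("sniper:", expected_points(sniper), "sample shot:", take_shot(sniper, rng))
```

Because the two ratings are independent, a player can be given high volume with poor efficiency (or the reverse) without one number dragging the other along.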
|
|