Dice sums comparison probability











I was playing a fun little board game called U.S. Patent No. 1, about time traveling to register your time machine as the first U.S. patent. I started to wonder what the dice probabilities were for basic attacks. In the game, you roll 2 dice to attack, and your opponent rolls 1 die for defense. If the attack result is greater than the defense, then a time machine part is "disabled", and if it is greater by 5 or more, then it is "destroyed"; otherwise the attack "misses". However, this is not the end: players can also have attack and defense bonuses ranging from 0 to 12, which are added to the roll result before making the comparison.



To begin my program, I first did a straightforward brute-force calculation that cycled through every attack and defense bonus from 0 through 12, where each number in the array was a probability expressed as a count out of 216 (the 6 × 6 × 6 equally likely rolls of three dice).



But this was a bit slower than I wanted (~500ms, not terrible), so I decided to use a bit more math and hardcode the probability distribution of two dice to use multiplication instead of addition. I also replaced the enum-keyed dict with a list where 0 is "MISSED", 1 is "DISABLED", and 2 is "DESTROYED", which seemed to significantly reduce overhead by my timings (from ~200ms to ~50ms).
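
As a quick sanity check on the hardcoded two-dice distribution, the sketch below counts the sums by enumeration and compares them with a list built the same way as `two_dice_prob` in the code further down. The name `sum_counts` is only illustrative and does not come from the original program.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 ordered rolls of two dice give each sum.
sum_counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Hardcoded list indexed directly by the sum; indices 0 and 1 are
# unused padding, since the smallest possible sum is 2.
two_dice_prob = [0, *range(6), *range(6, 0, -1)]

# Every sum from 2 to 12 has the same weight in both representations.
for total in range(2, 13):
    assert two_dice_prob[total] == sum_counts[total]
```

Note that the entries are counts out of 36 rather than probabilities; combined with the 6 faces of the defense die this gives the denominator of 216.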



But then I looked more carefully at my results and realized something that is, in retrospect, obvious about probability distributions of this nature: adding the same amount to both the attack and defense bonus has no effect on the distribution. Represented symbolically, the distribution for A, D is always going to be the same as the distribution for A + N, D + N for integers N. So instead of calculating separate values for attack and defense bonuses, I only need a single value, the attack shift (attack bonus minus defense bonus), which also made my array smaller and changed its dimensions:



    from itertools import product
    import numpy as np

    def calculate():
        two_dice_prob = [0, *range(6), *range(6,0,-1)]
        result = np.zeros((25,3))
        for attack_shift in range(-12, 13):
            result_counts = [0] * 3
            for defense, attack in product(range(1,7), range(2,13)):
                base_attack = attack
                attack += attack_shift
                if attack - 5 >= defense:
                    result_counts[2] += two_dice_prob[base_attack]
                elif attack > defense:
                    result_counts[1] += two_dice_prob[base_attack]
                else:
                    result_counts[0] += two_dice_prob[base_attack]
            result[attack_shift+12] = result_counts
        return result

    if __name__ == '__main__':
        print("Result:")
        print(calculate())
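
The observation that only the difference between the bonuses matters can be checked by brute force. This is a minimal sketch, not the original brute-force program (which is not shown here); the function name `outcome_counts` is an assumption made for illustration.

```python
from itertools import product

def outcome_counts(attack_bonus, defense_bonus):
    """Count [missed, disabled, destroyed] over all 216 rolls of three dice."""
    counts = [0, 0, 0]
    for d1, d2, defense in product(range(1, 7), repeat=3):
        diff = (d1 + d2 + attack_bonus) - (defense + defense_bonus)
        if diff >= 5:
            counts[2] += 1      # destroyed
        elif diff > 0:
            counts[1] += 1      # disabled
        else:
            counts[0] += 1      # missed
    return counts

# Adding the same N to both bonuses leaves the counts unchanged,
# so only attack_bonus - defense_bonus needs to be tabulated.
assert outcome_counts(3, 1) == outcome_counts(3 + 4, 1 + 4) == outcome_counts(2, 0)
```

With no bonuses at all, this gives 35 misses, 100 disables and 81 destroys out of 216.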


The result is a NumPy array where the top row is an attack with a shift of -12 (the largest penalty), the bottom row is an attack with a bonus of 12, and each row in between has an attack bonus 1 greater than the previous row. By virtue of processing fewer cases, it now takes only ~6ms.
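
To illustrate how such a table might be used, here is a pure-Python sketch that rebuilds the 25 rows and looks up exact probabilities for a pair of bonuses. The names `TWO_DICE_FREQ`, `counts_for_shift`, `TABLE` and `outcome_probabilities` are hypothetical helpers, not part of the program above.

```python
from fractions import Fraction
from itertools import product

# Number of ways to roll each sum with two dice: 1, 2, ..., 6, ..., 2, 1.
TWO_DICE_FREQ = {total: 6 - abs(total - 7) for total in range(2, 13)}

def counts_for_shift(shift):
    """Return [missed, disabled, destroyed] counts out of 216 for one shift."""
    counts = [0, 0, 0]
    for defense, attack in product(range(1, 7), range(2, 13)):
        diff = attack + shift - defense
        index = 2 if diff >= 5 else 1 if diff > 0 else 0
        counts[index] += TWO_DICE_FREQ[attack]
    return counts

# Row 0 corresponds to a shift of -12, row 24 to a shift of +12.
TABLE = [counts_for_shift(shift) for shift in range(-12, 13)]

def outcome_probabilities(attack_bonus, defense_bonus):
    """Look up the row for attack_bonus - defense_bonus as exact fractions."""
    row = TABLE[attack_bonus - defense_bonus + 12]
    return [Fraction(count, 216) for count in row]

miss, disable, destroy = outcome_probabilities(4, 4)
```

Since both bonuses are equal in the last line, the lookup lands on the middle row (shift 0) of the table.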



I was wondering how I could optimize the calculate function even more. I suspect I am not utilizing numpy's efficiency fully, since I am still not completely familiar with it.










  • How do 'bonuses' work? Is it whatever you roll plus it?
    – Peilonrayz
    18 hours ago










  • @Peilonrayz Yes; I'll make an edit.
    – Graham
    17 hours ago















python python-3.x numpy






edited 10 hours ago

























asked 18 hours ago









Graham

1 Answer
1. Review




  1. The name calculate is vague. Something like attack_freq would be more specific.


  2. There's no docstring. What does calculate do? What does it return?


  3. The values in two_dice_prob are not probabilities, they are counts or frequencies. (To get probabilities, you'd have to divide by 36.) I would use a name like two_dice_freq.


  4. By default numpy.zeros gives you an array of floats, but the result array only contains integers, so you could specify dtype=int when creating it. (Alternatively, by constructing the result using NumPy throughout, it can be arranged that it has the right data type.)



  5. When working with NumPy it's nearly always fastest if you structure the code to consist of a sequence of whole-array operations, rather than looping over the elements in native Python.



    In this case we can use numpy.mgrid to construct arrays containing all possibilities for attack shift, defence roll and attack roll:



    shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]


    Then we can find the difference between attack and defence simultaneously for all possibilities:



    diff = attack + shift - defence


    The three outcome classes can now be computed by comparing the difference against 0 and 5 to get arrays of Booleans, and then assembling the outcomes into a single array using numpy.stack:



    missed = diff <= 0
    destroyed = diff >= 5
    disabled = ~(missed | destroyed)
    outcome = np.stack((missed, disabled, destroyed), axis=1)


    The reason for choosing to stack along axis=1 is so that the outcome array has the right shape, that is, (25, 3, 6, 11). We multiply the last axis by the two-dice frequencies, and then sum over the last two axes. This leaves us with an array of frequencies with shape (25, 3) as required:



    return (outcome * two_dice_freq).sum(axis=(2, 3))


    If we had chosen to stack along axis=0, then at this point we would have an array with shape (3, 25) and we'd have to transpose it before returning. By choosing the right axis to stack along we avoided this transposition.
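
The effect of the stacking axis on the shapes can be seen in a short experiment. This sketch reuses the expressions from the steps above; the names `by_axis1`, `by_axis0`, `freq1` and `freq0` are introduced only for the comparison.

```python
import numpy as np

two_dice_freq = np.array([1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1])
shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]   # each has shape (25, 6, 11)
diff = attack + shift - defence
missed = diff <= 0
destroyed = diff >= 5
disabled = ~(missed | destroyed)

# Stack the same three Boolean arrays along two different axes.
by_axis1 = np.stack((missed, disabled, destroyed), axis=1)   # shape (25, 3, 6, 11)
by_axis0 = np.stack((missed, disabled, destroyed), axis=0)   # shape (3, 25, 6, 11)

# The frequencies broadcast along the last axis (attack roll) in both cases.
freq1 = (by_axis1 * two_dice_freq).sum(axis=(2, 3))          # shape (25, 3)
freq0 = (by_axis0 * two_dice_freq).sum(axis=(2, 3))          # shape (3, 25)

# Same numbers either way; axis=0 just needs a transpose at the end.
assert np.array_equal(freq1, freq0.T)
```

Each row of `freq1` sums to 216, one count for every combination of the three dice.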




2. Revised code



    import numpy as np

    def attack_freq():
        """Return array with shape (25, 3), whose (i + 12)'th row contains the
        frequencies of the three outcome classes (missed, disabled,
        destroyed) when attack bonus minus defense bonus is i.

        """
        two_dice_freq = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
        shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]
        diff = attack + shift - defence
        missed = diff <= 0
        destroyed = diff >= 5
        disabled = ~(missed | destroyed)
        outcome = np.stack((missed, disabled, destroyed), axis=1)
        return (outcome * two_dice_freq).sum(axis=(2, 3))


This computes the same results as the code in the post:



    >>> np.array_equal(calculate(), attack_freq())
    True


but it's roughly four times as fast:



    >>> from timeit import timeit
    >>> timeit(calculate, number=1000)
    0.38954256599993187
    >>> timeit(attack_freq, number=1000)
    0.09676001900004394


The actual runtimes are so small, less than a millisecond, that this speedup doesn't really matter in practice, since you'd only need to build this table once. However, the general technique, of applying a series of whole-array NumPy operations instead of looping in native Python, can make practical differences in other kinds of program, so it's worth practicing the technique even in small cases like this.






share|improve this answer





















    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "196"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209258%2fdice-sums-comparison-probability%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote



    accepted










    1. Review




    1. The name calculate is vague. Something like attack_freq would be more specific.


    2. There's no docstring. What does calculate do? What does it return?


    3. The values in two_dice_prob are not probabilities, they are counts or frequencies. (To get probabilities, you'd have to divide by 36.) I would use a name like two_dice_freq.


    4. By default numpy.zeros gives you an array of floats, but the result array only contains integers, so you could specify dtype=int when creating it. (Alternatively, by constructing the result using NumPy throughout, it can be arranged that it has the right data type.)



    5. When working with NumPy it's nearly always fastest if you structure the code to consist of a sequence of whole-array operations, rather than looping over the elements in native Python.



      In this case we can use numpy.mgrid to construct arrays containing all possibilities for attack shift, defence roll and attack roll:



      shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]


      Then we can find the difference between attack and defence simultaneously for all possibilities:



      diff = attack + shift - defence


      The three outcome classes can now be computed by comparing the difference against 0 and 5 to get arrays of Booleans, and then assembling the outcomes into a single array using numpy.stack:



      missed = diff <= 0
      destroyed = diff >= 5
      disabled = ~(missed | destroyed)
      outcome = np.stack((missed, disabled, destroyed), axis=1)


      The reason for choosing to stack along axis=1 is so that the outcome array has the right shape, that is, (25, 3, 6, 11). We multiply the last axis by the two-dice frequencies, and then sum over the last two axes. This leaves us with an array of frequencies with shape (25, 3) as required:



      return (outcome * two_dice_freq).sum(axis=(2, 3))


      If we had chosen to stack along axis=0, then at this point we would have an array with shape (3, 25) and we'd have to transpose it before returning. By choosing the right axis to stack along we avoided this transposition.




    2. Revised code



    import numpy as np

    def attack_freq():
    """Return array with shape (25, 3), whose (i + 12)'th row contains the
    frequencies of the three outcome classes (missed, disabled,
    destroyed) when attack bonus minus defense bonus is i.

    """
    two_dice_freq = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
    shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]
    diff = attack + shift - defence
    missed = diff <= 0
    destroyed = diff >= 5
    disabled = ~(missed | destroyed)
    outcome = np.stack((missed, disabled, destroyed), axis=1)
    return (outcome * two_dice_freq).sum(axis=(2, 3))


    This computes the same results as the code in the post:



    >>> np.array_equal(calculate(), attack_freq())
    True


    but it's roughly four times as fast:



    >>> from timeit import timeit
    >>> timeit(calculate, number=1000)
    0.38954256599993187
    >>> timeit(attack_freq, number=1000)
    0.09676001900004394


    The actual runtimes are so small, less than a millisecond, that this speedup doesn't really matter in practice, since you'd only need to build this table once. However, the general technique, of applying a series of whole-array NumPy operations instead of looping in native Python, can make practical differences in other kinds of program, so it's worth practicing the technique even in small cases like this.






    share|improve this answer

























      up vote
      1
      down vote



      accepted










      1. Review




      1. The name calculate is vague. Something like attack_freq would be more specific.


      2. There's no docstring. What does calculate do? What does it return?


      3. The values in two_dice_prob are not probabilities, they are counts or frequencies. (To get probabilities, you'd have to divide by 36.) I would use a name like two_dice_freq.


      4. By default numpy.zeros gives you an array of floats, but the result array only contains integers, so you could specify dtype=int when creating it. (Alternatively, by constructing the result using NumPy throughout, it can be arranged that it has the right data type.)



      5. When working with NumPy it's nearly always fastest if you structure the code to consist of a sequence of whole-array operations, rather than looping over the elements in native Python.



        In this case we can use numpy.mgrid to construct arrays containing all possibilities for attack shift, defence roll and attack roll:



        shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]


        Then we can find the difference between attack and defence simultaneously for all possibilities:



        diff = attack + shift - defence


        The three outcome classes can now be computed by comparing the difference against 0 and 5 to get arrays of Booleans, and then assembling the outcomes into a single array using numpy.stack:



        missed = diff <= 0
        destroyed = diff >= 5
        disabled = ~(missed | destroyed)
        outcome = np.stack((missed, disabled, destroyed), axis=1)


        The reason for choosing to stack along axis=1 is so that the outcome array has the right shape, that is, (25, 3, 6, 11). We multiply the last axis by the two-dice frequencies, and then sum over the last two axes. This leaves us with an array of frequencies with shape (25, 3) as required:



        return (outcome * two_dice_freq).sum(axis=(2, 3))


        If we had chosen to stack along axis=0, then at this point we would have an array with shape (3, 25) and we'd have to transpose it before returning. By choosing the right axis to stack along we avoided this transposition.




      2. Revised code



      import numpy as np

      def attack_freq():
      """Return array with shape (25, 3), whose (i + 12)'th row contains the
      frequencies of the three outcome classes (missed, disabled,
      destroyed) when attack bonus minus defense bonus is i.

      """
      two_dice_freq = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
      shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]
      diff = attack + shift - defence
      missed = diff <= 0
      destroyed = diff >= 5
      disabled = ~(missed | destroyed)
      outcome = np.stack((missed, disabled, destroyed), axis=1)
      return (outcome * two_dice_freq).sum(axis=(2, 3))


      This computes the same results as the code in the post:



      >>> np.array_equal(calculate(), attack_freq())
      True


      but it's roughly four times as fast:



      >>> from timeit import timeit
      >>> timeit(calculate, number=1000)
      0.38954256599993187
      >>> timeit(attack_freq, number=1000)
      0.09676001900004394


      The actual runtimes are so small, less than a millisecond, that this speedup doesn't really matter in practice, since you'd only need to build this table once. However, the general technique, of applying a series of whole-array NumPy operations instead of looping in native Python, can make practical differences in other kinds of program, so it's worth practicing the technique even in small cases like this.






      share|improve this answer























        up vote
        1
        down vote



        accepted







        up vote
        1
        down vote



        accepted






        1. Review




        1. The name calculate is vague. Something like attack_freq would be more specific.


        2. There's no docstring. What does calculate do? What does it return?


        3. The values in two_dice_prob are not probabilities, they are counts or frequencies. (To get probabilities, you'd have to divide by 36.) I would use a name like two_dice_freq.


        4. By default numpy.zeros gives you an array of floats, but the result array only contains integers, so you could specify dtype=int when creating it. (Alternatively, by constructing the result using NumPy throughout, it can be arranged that it has the right data type.)



        5. When working with NumPy it's nearly always fastest if you structure the code to consist of a sequence of whole-array operations, rather than looping over the elements in native Python.



          In this case we can use numpy.mgrid to construct arrays containing all possibilities for attack shift, defence roll and attack roll:



          shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]


          Then we can find the difference between attack and defence simultaneously for all possibilities:



          diff = attack + shift - defence


          The three outcome classes can now be computed by comparing the difference against 0 and 5 to get arrays of Booleans, and then assembling the outcomes into a single array using numpy.stack:



          missed = diff <= 0
          destroyed = diff >= 5
          disabled = ~(missed | destroyed)
          outcome = np.stack((missed, disabled, destroyed), axis=1)


          The reason for choosing to stack along axis=1 is so that the outcome array has the right shape, that is, (25, 3, 6, 11). We multiply the last axis by the two-dice frequencies, and then sum over the last two axes. This leaves us with an array of frequencies with shape (25, 3) as required:



          return (outcome * two_dice_freq).sum(axis=(2, 3))


          If we had chosen to stack along axis=0, then at this point we would have an array with shape (3, 25) and we'd have to transpose it before returning. By choosing the right axis to stack along we avoided this transposition.




        2. Revised code



        import numpy as np

        def attack_freq():
        """Return array with shape (25, 3), whose (i + 12)'th row contains the
        frequencies of the three outcome classes (missed, disabled,
        destroyed) when attack bonus minus defense bonus is i.

        """
        two_dice_freq = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
        shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]
        diff = attack + shift - defence
        missed = diff <= 0
        destroyed = diff >= 5
        disabled = ~(missed | destroyed)
        outcome = np.stack((missed, disabled, destroyed), axis=1)
        return (outcome * two_dice_freq).sum(axis=(2, 3))


        This computes the same results as the code in the post:



        >>> np.array_equal(calculate(), attack_freq())
        True


        but it's roughly four times as fast:



        >>> from timeit import timeit
        >>> timeit(calculate, number=1000)
        0.38954256599993187
        >>> timeit(attack_freq, number=1000)
        0.09676001900004394


        The actual runtimes are so small, less than a millisecond, that this speedup doesn't really matter in practice, since you'd only need to build this table once. However, the general technique, of applying a series of whole-array NumPy operations instead of looping in native Python, can make practical differences in other kinds of program, so it's worth practicing the technique even in small cases like this.






        share|improve this answer












        1. Review




        1. The name calculate is vague. Something like attack_freq would be more specific.


        2. There's no docstring. What does calculate do? What does it return?


        3. The values in two_dice_prob are not probabilities, they are counts or frequencies. (To get probabilities, you'd have to divide by 36.) I would use a name like two_dice_freq.


        4. By default numpy.zeros gives you an array of floats, but the result array only contains integers, so you could specify dtype=int when creating it. (Alternatively, by constructing the result using NumPy throughout, it can be arranged that it has the right data type.)



        5. When working with NumPy it's nearly always fastest if you structure the code to consist of a sequence of whole-array operations, rather than looping over the elements in native Python.



          In this case we can use numpy.mgrid to construct arrays containing all possibilities for attack shift, defence roll and attack roll:



          shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]


          Then we can find the difference between attack and defence simultaneously for all possibilities:



          diff = attack + shift - defence


          The three outcome classes can now be computed by comparing the difference against 0 and 5 to get arrays of Booleans, and then assembling the outcomes into a single array using numpy.stack:



          missed = diff <= 0
          destroyed = diff >= 5
          disabled = ~(missed | destroyed)
          outcome = np.stack((missed, disabled, destroyed), axis=1)


          The reason for choosing to stack along axis=1 is so that the outcome array has the right shape, that is, (25, 3, 6, 11). We multiply by the two-dice frequencies, which NumPy broadcasts along the last axis (length 11, one entry per attack roll from 2 to 12), and then sum over the last two axes. This leaves us with an array of frequencies with shape (25, 3) as required:



          return (outcome * two_dice_freq).sum(axis=(2, 3))


          If we had chosen to stack along axis=0, then at this point we would have an array with shape (3, 25) and we'd have to transpose it before returning. By choosing the right axis to stack along we avoided this transposition.
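
The shapes described above can be checked directly. The following is a minimal sketch rather than part of the revised code: it uses only two outcome classes (missed and not missed) to keep it short, and the names by_axis1, by_axis0, weighted and totals are made up for illustration.

```python
import numpy as np

# Each of the three grids has one axis per range: 25 shifts, 6 defence
# rolls, 11 attack rolls.
shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]
print(shift.shape)       # (25, 6, 11)

missed = attack + shift - defence <= 0

# Stacking along axis=1 puts the outcome class right after the shift axis;
# stacking along axis=0 puts it first.
by_axis1 = np.stack((missed, ~missed), axis=1)
by_axis0 = np.stack((missed, ~missed), axis=0)
print(by_axis1.shape)    # (25, 2, 6, 11)
print(by_axis0.shape)    # (2, 25, 6, 11)

# The list of 11 frequencies broadcasts along the last axis, and the
# Boolean array is promoted to an integer array by the multiplication.
two_dice_freq = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
weighted = by_axis1 * two_dice_freq
totals = weighted.sum(axis=(2, 3))
print(totals.shape)      # (25, 2)
```

Because the multiplication already yields integers, the result has an integer data type without any explicit dtype argument, which is the alternative mentioned in point 4.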




        2. Revised code



        import numpy as np

        def attack_freq():
            """Return array with shape (25, 3), whose (i + 12)'th row contains the
            frequencies of the three outcome classes (missed, disabled,
            destroyed) when attack bonus minus defense bonus is i.

            """
            two_dice_freq = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
            shift, defence, attack = np.mgrid[-12:13, 1:7, 2:13]
            diff = attack + shift - defence
            missed = diff <= 0
            destroyed = diff >= 5
            disabled = ~(missed | destroyed)
            outcome = np.stack((missed, disabled, destroyed), axis=1)
            return (outcome * two_dice_freq).sum(axis=(2, 3))


        This computes the same results as the code in the post:



        >>> np.array_equal(calculate(), attack_freq())
        True


        but it's roughly four times as fast:



        >>> from timeit import timeit
        >>> timeit(calculate, number=1000)
        0.38954256599993187
        >>> timeit(attack_freq, number=1000)
        0.09676001900004394


        Each call takes less than a millisecond, so this speedup doesn't really matter in practice: you only need to build the table once. However, the general technique of applying a series of whole-array NumPy operations, instead of looping in native Python, can make a practical difference in other kinds of programs, so it's worth practicing even in small cases like this.
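
For an independent check that does not rely on the code in the post, the table can also be built by enumerating all 216 rolls of three dice for each attack shift. This is only a sketch: the function name attack_freq_brute_force is hypothetical, and the loop is deliberately the slow native-Python style that the vectorised version avoids. It also shows dtype=int on numpy.zeros, and how to turn the frequencies into probabilities by dividing by 216.

```python
from itertools import product

import numpy as np


def attack_freq_brute_force():
    """Count outcomes by enumerating every roll for each attack shift.

    Illustrative cross-check only: same (25, 3) layout as attack_freq,
    with columns (missed, disabled, destroyed).
    """
    result = np.zeros((25, 3), dtype=int)
    faces = range(1, 7)
    for shift in range(-12, 13):
        # Two attack dice and one defence die: 6**3 = 216 rolls per shift.
        for die1, die2, defence in product(faces, repeat=3):
            diff = die1 + die2 + shift - defence
            if diff <= 0:
                outcome = 0  # missed
            elif diff >= 5:
                outcome = 2  # destroyed
            else:
                outcome = 1  # disabled
            result[shift + 12, outcome] += 1
    return result


freq = attack_freq_brute_force()
prob = freq / 6**3   # each row of freq sums to 216, so each row of prob sums to 1
print(freq[12])      # shift 0: 35 missed, 100 disabled, 81 destroyed
```

With attack_freq defined as above, np.array_equal(attack_freq_brute_force(), attack_freq()) gives a second way to confirm the vectorised result.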

















        answered 9 hours ago









        Gareth Rees






























