Python generator for “barely obtuse” triangles (Project Euler 224)












5














I am trying to solve Project Euler Problem 224:




Let us call an integer sided triangle with sides abc barely obtuse if the sides satisfy a2 + b2 = c2 - 1.



How many barely obtuse triangles are there with perimeter ≤ 75,000,000?




For large v_max and v_max_n, the generator below yields very slow when I have 750000000 and 500000000, respectively. Having tried for days, I am still looking for an algorithm that can do this in under a minute.



from itertools import product
import math

n = 15 * 10 ** 8
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3

(i for i in product(range(2, v_max_n + 1, 2),
range(2, v_max + 1, 2),
range(3, v_max + 1))
if i[0] <= i[1] and i[1] <= i[2] and sum(i) <= n and
i[0] ** 2 + i[1] ** 2 == i[2] ** 2 - 1)


I have tried using it as map(lambda x: x, gen), nested for loops, and have generally been racking my brain to no avail.



I tried re-writing b, c as expressions. This tends to be quick but misses some of the triples.



n = 12
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3
j = 0 # tested this with low n and compared: 12, 150 to see
for a in range(1, v_max_n + 1, 1):
b = (n ** 2 - 2 * n * a - 1) / (2 * (n - a))
c = (a ** 2 + b ** 2 + 1) ** 0.5
if c % 1 == 0 and b % 1 == 0 and a + int(b) + int(c) <= n:
# print(a, int(b), int(c))
j += 1
print(j)









share|improve this question
























  • For me this does not even fit into memory (16 GB). Which is weird, since product should not make any intermediate lists and neither should range...
    – Graipher
    Dec 17 at 11:28










  • @Graipher you should just get a generator object when you initialize the generator which shouldn't use any memory
    – dustin
    Dec 17 at 11:31






  • 1




    @Graipher sum(i) <= n. Added the variable definitions.
    – dustin
    Dec 17 at 11:36








  • 1




    It would probably help to tell what problem you are actually trying to achieve.
    – Josay
    Dec 17 at 16:24






  • 1




    A running time "under a minute" (or in a few minutes, but somewhere in that ballpark) is certainly possible, but not using a brute-force-search over (10^6)^3 possibilities.
    – tobias_k
    Dec 17 at 21:13
















5














I am trying to solve Project Euler Problem 224:




Let us call an integer sided triangle with sides abc barely obtuse if the sides satisfy a2 + b2 = c2 - 1.



How many barely obtuse triangles are there with perimeter ≤ 75,000,000?




For large v_max and v_max_n, the generator below yields very slow when I have 750000000 and 500000000, respectively. Having tried for days, I am still looking for an algorithm that can do this in under a minute.



from itertools import product
import math

n = 15 * 10 ** 8
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3

(i for i in product(range(2, v_max_n + 1, 2),
range(2, v_max + 1, 2),
range(3, v_max + 1))
if i[0] <= i[1] and i[1] <= i[2] and sum(i) <= n and
i[0] ** 2 + i[1] ** 2 == i[2] ** 2 - 1)


I have tried using it as map(lambda x: x, gen), nested for loops, and have generally been racking my brain to no avail.



I tried re-writing b, c as expressions. This tends to be quick but misses some of the triples.



n = 12
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3
j = 0 # tested this with low n and compared: 12, 150 to see
for a in range(1, v_max_n + 1, 1):
b = (n ** 2 - 2 * n * a - 1) / (2 * (n - a))
c = (a ** 2 + b ** 2 + 1) ** 0.5
if c % 1 == 0 and b % 1 == 0 and a + int(b) + int(c) <= n:
# print(a, int(b), int(c))
j += 1
print(j)









share|improve this question
























  • For me this does not even fit into memory (16 GB). Which is weird, since product should not make any intermediate lists and neither should range...
    – Graipher
    Dec 17 at 11:28










  • @Graipher you should just get a generator object when you initialize the generator which shouldn't use any memory
    – dustin
    Dec 17 at 11:31






  • 1




    @Graipher sum(i) <= n. Added the variable definitions.
    – dustin
    Dec 17 at 11:36








  • 1




    It would probably help to tell what problem you are actually trying to achieve.
    – Josay
    Dec 17 at 16:24






  • 1




    A running time "under a minute" (or in a few minutes, but somewhere in that ballpark) is certainly possible, but not using a brute-force-search over (10^6)^3 possibilities.
    – tobias_k
    Dec 17 at 21:13














5












5








5







I am trying to solve Project Euler Problem 224:




Let us call an integer sided triangle with sides abc barely obtuse if the sides satisfy a2 + b2 = c2 - 1.



How many barely obtuse triangles are there with perimeter ≤ 75,000,000?




For large v_max and v_max_n, the generator below yields very slow when I have 750000000 and 500000000, respectively. Having tried for days, I am still looking for an algorithm that can do this in under a minute.



from itertools import product
import math

n = 15 * 10 ** 8
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3

(i for i in product(range(2, v_max_n + 1, 2),
range(2, v_max + 1, 2),
range(3, v_max + 1))
if i[0] <= i[1] and i[1] <= i[2] and sum(i) <= n and
i[0] ** 2 + i[1] ** 2 == i[2] ** 2 - 1)


I have tried using it as map(lambda x: x, gen), nested for loops, and have generally been racking my brain to no avail.



I tried re-writing b, c as expressions. This tends to be quick but misses some of the triples.



n = 12
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3
j = 0 # tested this with low n and compared: 12, 150 to see
for a in range(1, v_max_n + 1, 1):
b = (n ** 2 - 2 * n * a - 1) / (2 * (n - a))
c = (a ** 2 + b ** 2 + 1) ** 0.5
if c % 1 == 0 and b % 1 == 0 and a + int(b) + int(c) <= n:
# print(a, int(b), int(c))
j += 1
print(j)









share|improve this question















I am trying to solve Project Euler Problem 224:




Let us call an integer sided triangle with sides abc barely obtuse if the sides satisfy a2 + b2 = c2 - 1.



How many barely obtuse triangles are there with perimeter ≤ 75,000,000?




For large v_max and v_max_n, the generator below yields very slow when I have 750000000 and 500000000, respectively. Having tried for days, I am still looking for an algorithm that can do this in under a minute.



from itertools import product
import math

n = 15 * 10 ** 8
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3

(i for i in product(range(2, v_max_n + 1, 2),
range(2, v_max + 1, 2),
range(3, v_max + 1))
if i[0] <= i[1] and i[1] <= i[2] and sum(i) <= n and
i[0] ** 2 + i[1] ** 2 == i[2] ** 2 - 1)


I have tried using it as map(lambda x: x, gen), nested for loops, and have generally been racking my brain to no avail.



I tried re-writing b, c as expressions. This tends to be quick but misses some of the triples.



n = 12
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3
j = 0 # tested this with low n and compared: 12, 150 to see
for a in range(1, v_max_n + 1, 1):
b = (n ** 2 - 2 * n * a - 1) / (2 * (n - a))
c = (a ** 2 + b ** 2 + 1) ** 0.5
if c % 1 == 0 and b % 1 == 0 and a + int(b) + int(c) <= n:
# print(a, int(b), int(c))
j += 1
print(j)






python programming-challenge time-limit-exceeded iterator generator






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 18 at 2:18

























asked Dec 17 at 11:20









dustin

567




567












  • For me this does not even fit into memory (16 GB). Which is weird, since product should not make any intermediate lists and neither should range...
    – Graipher
    Dec 17 at 11:28










  • @Graipher you should just get a generator object when you initialize the generator which shouldn't use any memory
    – dustin
    Dec 17 at 11:31






  • 1




    @Graipher sum(i) <= n. Added the variable definitions.
    – dustin
    Dec 17 at 11:36








  • 1




    It would probably help to tell what problem you are actually trying to achieve.
    – Josay
    Dec 17 at 16:24






  • 1




    A running time "under a minute" (or in a few minutes, but somewhere in that ballpark) is certainly possible, but not using a brute-force-search over (10^6)^3 possibilities.
    – tobias_k
    Dec 17 at 21:13


















  • For me this does not even fit into memory (16 GB). Which is weird, since product should not make any intermediate lists and neither should range...
    – Graipher
    Dec 17 at 11:28










  • @Graipher you should just get a generator object when you initialize the generator which shouldn't use any memory
    – dustin
    Dec 17 at 11:31






  • 1




    @Graipher sum(i) <= n. Added the variable definitions.
    – dustin
    Dec 17 at 11:36








  • 1




    It would probably help to tell what problem you are actually trying to achieve.
    – Josay
    Dec 17 at 16:24






  • 1




    A running time "under a minute" (or in a few minutes, but somewhere in that ballpark) is certainly possible, but not using a brute-force-search over (10^6)^3 possibilities.
    – tobias_k
    Dec 17 at 21:13
















For me this does not even fit into memory (16 GB). Which is weird, since product should not make any intermediate lists and neither should range...
– Graipher
Dec 17 at 11:28




For me this does not even fit into memory (16 GB). Which is weird, since product should not make any intermediate lists and neither should range...
– Graipher
Dec 17 at 11:28












@Graipher you should just get a generator object when you initialize the generator which shouldn't use any memory
– dustin
Dec 17 at 11:31




@Graipher you should just get a generator object when you initialize the generator which shouldn't use any memory
– dustin
Dec 17 at 11:31




1




1




@Graipher sum(i) <= n. Added the variable definitions.
– dustin
Dec 17 at 11:36






@Graipher sum(i) <= n. Added the variable definitions.
– dustin
Dec 17 at 11:36






1




1




It would probably help to tell what problem you are actually trying to achieve.
– Josay
Dec 17 at 16:24




It would probably help to tell what problem you are actually trying to achieve.
– Josay
Dec 17 at 16:24




1




1




A running time "under a minute" (or in a few minutes, but somewhere in that ballpark) is certainly possible, but not using a brute-force-search over (10^6)^3 possibilities.
– tobias_k
Dec 17 at 21:13




A running time "under a minute" (or in a few minutes, but somewhere in that ballpark) is certainly possible, but not using a brute-force-search over (10^6)^3 possibilities.
– tobias_k
Dec 17 at 21:13










3 Answers
3






active

oldest

votes


















3














It seems like you want to find a solution for a²+b²=c²-1 with some added constraints, like a<=b<=c, a+b+c<=n, etc. by trying all possible combinations of a, b, and c and testing whether they satisfy all your conditions at once. This is pretty wasteful.



Instead of itertools.product, you can use a generator expression with three regular for loops. This has several advantages:





  • you could move some of the if clauses up after the first or second loop, i.e. you don't have to run the other loops if those conditions already fail at this point, e.g.



     for x in ... if test(x) for y in ... if test(y) ...


  • you can use the values of your previous variables in the lower and upper bounds for the later variables (in your case this replaces most of your if conditions)


  • you can basically calculate the value for c from a and b without testing all possible values


This is how I'd do it:



gen = ((a, b, int(c))
for a in range(2, v_max_n + 1, 2)
for b in range(a, min(v_max + 1, n - a - a), 2) # use a for lower and upper bounds
for c in [(a ** 2 + b ** 2 + 1)**0.5] # just a single candidate for c
if c % 1 == 0) # whole-numbered c found?


Note that the calculation of c using very_large_number**0.5 might be imprecise with float; using decimal might work, though. However, even with those optimizations, testing much fewer values than your original loop (on the order of O(n²) instead of O(n³)), it might not be feasible to find solution for large values of a, b and c.



Also note that I did not thoroughly test this for off-by-one errors, since performance was the main concern.



Another thing that you might try: Invert the loops for a and for b, i.e. instead of testing all b for each a, no matter how large the difference, test all a up to b first. This does not decrease the number of combinations to test, but it seems to yield much more valid combinations in the "smaller" number much faster. Like this (upper bound for b can probably be reduced):



       for b in range(2, v_max + 1, 2)
for a in range(2, b+1, 2)





share|improve this answer























  • It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
    – dustin
    Dec 17 at 14:53












  • @dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
    – tobias_k
    Dec 17 at 15:03










  • I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
    – dustin
    Dec 17 at 15:05










  • I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
    – Oscar Smith
    Dec 17 at 15:26






  • 1




    yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.
    – Oscar Smith
    Dec 17 at 15:35



















2















v_max_n = n // 3



The name of this variable is a mystery to me, but you seem to be using it as an upper bound on $a$. You can get a notably better upper bound on $a$ very easily: since $a le b$ and $c^2 = a^2 + b^2 + 1$, $a + b + c ge 2a + sqrt{2a^2 + 1} > 2a + sqrt{2a^2} = (2 + sqrt 2)a$. Then your upper bound on $a$ can be reduced by 12.13%.





Given a value of $a$, the constraints on $b$ are $a le b$ and $a + b + sqrt{a^2+b^2+1} le n$. Rearrange and square: $a^2 + b^2 + 1 le n^2 + a^2 + b^2 - 2an +2ab -2bn$ or $b le frac{n^2 - 2an - 1}{2n - 2a}$. I'm not sure why you appear to be using this bound as a forced value in the code you added to the question.





$$sum_{a=1}^{frac{n}{2 + sqrt 2}} left(frac{n^2 - 2an - 1}{2n - 2a} - aright)
= left(sum_{a=1}^{frac{n}{2 + sqrt 2}} frac{n^2 - 2an - 1}{2n - 2a}right) - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
= frac{n^2 + 1}2 left(psileft(frac{n}{2 + sqrt 2}-n+1right) - psi(1-n)right) + frac{n^2}{2 + sqrt 2} - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
= frac{n^2 + 1}2 left(psileft(1-frac{1+sqrt2}{2 + sqrt 2}nright) - psi(1-n)right) + frac{(3 + 2sqrt 2)n^2 + (2 + sqrt 2)n}{2(2 + sqrt 2)^2} \
> 0.9n^2
$$

The step from the sum to an expression in terms of the digamma function is courtesy of Wolfram Alpha.



As a rule of thumb, you don't want to be repeating a loop body $10^{12}$ times, so looping over all values of $a$ and all values of $b$ is a non-starter.





Hint: try rearranging $a^2 + b^2 = c^2 - 1$.






share|improve this answer





























    1














    One improvement over @tobias_k s answer is to add the extra upper bound on b of b<1 + a*a//2. This is valid because after that, the distance between b^2 and (b+1)^2 will be bigger than a^2, so c^2 does not exist. This provides a massive speedup at least a first.






    share|improve this answer





















    • Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance
      – dustin
      Dec 17 at 16:17










    • Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)
      – tobias_k
      Dec 17 at 16:23












    • @dustin How did you add that "extra check"? I added it to the min; did you do something else?
      – tobias_k
      Dec 17 at 16:26










    • @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.
      – dustin
      Dec 17 at 16:35












    • @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.
      – tobias_k
      Dec 17 at 16:39













    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "196"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209821%2fpython-generator-for-barely-obtuse-triangles-project-euler-224%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    3














    It seems like you want to find a solution for a²+b²=c²-1 with some added constraints, like a<=b<=c, a+b+c<=n, etc. by trying all possible combinations of a, b, and c and testing whether they satisfy all your conditions at once. This is pretty wasteful.



    Instead of itertools.product, you can use a generator expression with three regular for loops. This has several advantages:





    • you could move some of the if clauses up after the first or second loop, i.e. you don't have to run the other loops if those conditions already fail at this point, e.g.



       for x in ... if test(x) for y in ... if test(y) ...


    • you can use the values of your previous variables in the lower and upper bounds for the later variables (in your case this replaces most of your if conditions)


    • you can basically calculate the value for c from a and b without testing all possible values


    This is how I'd do it:



    gen = ((a, b, int(c))
    for a in range(2, v_max_n + 1, 2)
    for b in range(a, min(v_max + 1, n - a - a), 2) # use a for lower and upper bounds
    for c in [(a ** 2 + b ** 2 + 1)**0.5] # just a single candidate for c
    if c % 1 == 0) # whole-numbered c found?


    Note that the calculation of c using very_large_number**0.5 might be imprecise with float; using decimal might work, though. However, even with those optimizations, testing much fewer values than your original loop (on the order of O(n²) instead of O(n³)), it might not be feasible to find solution for large values of a, b and c.



    Also note that I did not thoroughly test this for off-by-one errors, since performance was the main concern.



    Another thing that you might try: Invert the loops for a and for b, i.e. instead of testing all b for each a, no matter how large the difference, test all a up to b first. This does not decrease the number of combinations to test, but it seems to yield much more valid combinations in the "smaller" number much faster. Like this (upper bound for b can probably be reduced):



           for b in range(2, v_max + 1, 2)
    for a in range(2, b+1, 2)





    share|improve this answer























    • It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
      – dustin
      Dec 17 at 14:53












    • @dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
      – tobias_k
      Dec 17 at 15:03










    • I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
      – dustin
      Dec 17 at 15:05










    • I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
      – Oscar Smith
      Dec 17 at 15:26






    • 1




      yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.
      – Oscar Smith
      Dec 17 at 15:35
















    3














    It seems like you want to find a solution for a²+b²=c²-1 with some added constraints, like a<=b<=c, a+b+c<=n, etc. by trying all possible combinations of a, b, and c and testing whether they satisfy all your conditions at once. This is pretty wasteful.



    Instead of itertools.product, you can use a generator expression with three regular for loops. This has several advantages:





    • you could move some of the if clauses up after the first or second loop, i.e. you don't have to run the other loops if those conditions already fail at this point, e.g.



       for x in ... if test(x) for y in ... if test(y) ...


    • you can use the values of your previous variables in the lower and upper bounds for the later variables (in your case this replaces most of your if conditions)


    • you can basically calculate the value for c from a and b without testing all possible values


    This is how I'd do it:



    gen = ((a, b, int(c))
    for a in range(2, v_max_n + 1, 2)
    for b in range(a, min(v_max + 1, n - a - a), 2) # use a for lower and upper bounds
    for c in [(a ** 2 + b ** 2 + 1)**0.5] # just a single candidate for c
    if c % 1 == 0) # whole-numbered c found?


    Note that the calculation of c using very_large_number**0.5 might be imprecise with float; using decimal might work, though. However, even with those optimizations, testing much fewer values than your original loop (on the order of O(n²) instead of O(n³)), it might not be feasible to find solution for large values of a, b and c.



    Also note that I did not thoroughly test this for off-by-one errors, since performance was the main concern.



    Another thing that you might try: Invert the loops for a and for b, i.e. instead of testing all b for each a, no matter how large the difference, test all a up to b first. This does not decrease the number of combinations to test, but it seems to yield much more valid combinations in the "smaller" number much faster. Like this (upper bound for b can probably be reduced):



           for b in range(2, v_max + 1, 2)
    for a in range(2, b+1, 2)





    share|improve this answer























    • It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
      – dustin
      Dec 17 at 14:53












    • @dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
      – tobias_k
      Dec 17 at 15:03










    • I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
      – dustin
      Dec 17 at 15:05










    • I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
      – Oscar Smith
      Dec 17 at 15:26






    • 1




      yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.
      – Oscar Smith
      Dec 17 at 15:35














    3












    3








    3






    It seems like you want to find a solution for a²+b²=c²-1 with some added constraints, like a<=b<=c, a+b+c<=n, etc. by trying all possible combinations of a, b, and c and testing whether they satisfy all your conditions at once. This is pretty wasteful.



    Instead of itertools.product, you can use a generator expression with three regular for loops. This has several advantages:





    • you could move some of the if clauses up after the first or second loop, i.e. you don't have to run the other loops if those conditions already fail at this point, e.g.



       for x in ... if test(x) for y in ... if test(y) ...


    • you can use the values of your previous variables in the lower and upper bounds for the later variables (in your case this replaces most of your if conditions)


    • you can basically calculate the value for c from a and b without testing all possible values


    This is how I'd do it:



    gen = ((a, b, int(c))
    for a in range(2, v_max_n + 1, 2)
    for b in range(a, min(v_max + 1, n - a - a), 2) # use a for lower and upper bounds
    for c in [(a ** 2 + b ** 2 + 1)**0.5] # just a single candidate for c
    if c % 1 == 0) # whole-numbered c found?


    Note that the calculation of c using very_large_number**0.5 might be imprecise with float; using decimal might work, though. However, even with those optimizations, testing much fewer values than your original loop (on the order of O(n²) instead of O(n³)), it might not be feasible to find solution for large values of a, b and c.



    Also note that I did not thoroughly test this for off-by-one errors, since performance was the main concern.



    Another thing that you might try: Invert the loops for a and for b, i.e. instead of testing all b for each a, no matter how large the difference, test all a up to b first. This does not decrease the number of combinations to test, but it seems to yield much more valid combinations in the "smaller" number much faster. Like this (upper bound for b can probably be reduced):



           for b in range(2, v_max + 1, 2)
    for a in range(2, b+1, 2)





    share|improve this answer














    It seems like you want to find a solution for a²+b²=c²-1 with some added constraints, like a<=b<=c, a+b+c<=n, etc. by trying all possible combinations of a, b, and c and testing whether they satisfy all your conditions at once. This is pretty wasteful.



    Instead of itertools.product, you can use a generator expression with three regular for loops. This has several advantages:





    • you could move some of the if clauses up after the first or second loop, i.e. you don't have to run the other loops if those conditions already fail at this point, e.g.



       for x in ... if test(x) for y in ... if test(y) ...


    • you can use the values of your previous variables in the lower and upper bounds for the later variables (in your case this replaces most of your if conditions)


    • you can basically calculate the value for c from a and b without testing all possible values


    This is how I'd do it:



    gen = ((a, b, int(c))
    for a in range(2, v_max_n + 1, 2)
    for b in range(a, min(v_max + 1, n - a - a), 2) # use a for lower and upper bounds
    for c in [(a ** 2 + b ** 2 + 1)**0.5] # just a single candidate for c
    if c % 1 == 0) # whole-numbered c found?


    Note that the calculation of c using very_large_number**0.5 might be imprecise with float; using decimal might work, though. However, even with those optimizations, testing much fewer values than your original loop (on the order of O(n²) instead of O(n³)), it might not be feasible to find solution for large values of a, b and c.



    Also note that I did not thoroughly test this for off-by-one errors, since performance was the main concern.



    Another thing that you might try: Invert the loops for a and for b, i.e. instead of testing all b for each a, no matter how large the difference, test all a up to b first. This does not decrease the number of combinations to test, but it seems to yield much more valid combinations in the "smaller" number much faster. Like this (upper bound for b can probably be reduced):



           for b in range(2, v_max + 1, 2)
    for a in range(2, b+1, 2)






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Dec 17 at 15:05

























    answered Dec 17 at 14:51









    tobias_k

    1,509916




    1,509916












    • It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
      – dustin
      Dec 17 at 14:53












    • @dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
      – tobias_k
      Dec 17 at 15:03










    • I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
      – dustin
      Dec 17 at 15:05










    • I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
      – Oscar Smith
      Dec 17 at 15:26






    • 1




      yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.
      – Oscar Smith
      Dec 17 at 15:35


















    • It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
      – dustin
      Dec 17 at 14:53












    • @dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
      – tobias_k
      Dec 17 at 15:03










    • I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
      – dustin
      Dec 17 at 15:05










    • I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
      – Oscar Smith
      Dec 17 at 15:26






    • 1




      yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.
      – Oscar Smith
      Dec 17 at 15:35
















    It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
    – dustin
    Dec 17 at 14:53






    It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
    – dustin
    Dec 17 at 14:53














    @dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
    – tobias_k
    Dec 17 at 15:03




    @dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
    – tobias_k
    Dec 17 at 15:03












    I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
    – dustin
    Dec 17 at 15:05




    I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
    – dustin
    Dec 17 at 15:05












    I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
    – Oscar Smith
    Dec 17 at 15:26




    I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
    – Oscar Smith
    Dec 17 at 15:26




    1




    1




    yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.
    – Oscar Smith
    Dec 17 at 15:35




    yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.
    – Oscar Smith
    Dec 17 at 15:35













    2















    v_max_n = n // 3



    The name of this variable is a mystery to me, but you seem to be using it as an upper bound on $a$. You can get a notably better upper bound on $a$ very easily: since $a le b$ and $c^2 = a^2 + b^2 + 1$, $a + b + c ge 2a + sqrt{2a^2 + 1} > 2a + sqrt{2a^2} = (2 + sqrt 2)a$. Then your upper bound on $a$ can be reduced by 12.13%.





    Given a value of $a$, the constraints on $b$ are $a le b$ and $a + b + sqrt{a^2+b^2+1} le n$. Rearrange and square: $a^2 + b^2 + 1 le n^2 + a^2 + b^2 - 2an +2ab -2bn$ or $b le frac{n^2 - 2an - 1}{2n - 2a}$. I'm not sure why you appear to be using this bound as a forced value in the code you added to the question.





    $$sum_{a=1}^{frac{n}{2 + sqrt 2}} left(frac{n^2 - 2an - 1}{2n - 2a} - aright)
    = left(sum_{a=1}^{frac{n}{2 + sqrt 2}} frac{n^2 - 2an - 1}{2n - 2a}right) - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
    = frac{n^2 + 1}2 left(psileft(frac{n}{2 + sqrt 2}-n+1right) - psi(1-n)right) + frac{n^2}{2 + sqrt 2} - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
    = frac{n^2 + 1}2 left(psileft(1-frac{1+sqrt2}{2 + sqrt 2}nright) - psi(1-n)right) + frac{(3 + 2sqrt 2)n^2 + (2 + sqrt 2)n}{2(2 + sqrt 2)^2} \
    > 0.9n^2
    $$

    The step from the sum to an expression in terms of the digamma function is courtesy of Wolfram Alpha.



    As a rule of thumb, you don't want to be repeating a loop body $10^{12}$ times, so looping over all values of $a$ and all values of $b$ is a non-starter.





    Hint: try rearranging $a^2 + b^2 = c^2 - 1$.






    share|improve this answer


























      2















      v_max_n = n // 3



      The name of this variable is a mystery to me, but you seem to be using it as an upper bound on $a$. You can get a notably better upper bound on $a$ very easily: since $a le b$ and $c^2 = a^2 + b^2 + 1$, $a + b + c ge 2a + sqrt{2a^2 + 1} > 2a + sqrt{2a^2} = (2 + sqrt 2)a$. Then your upper bound on $a$ can be reduced by 12.13%.





      Given a value of $a$, the constraints on $b$ are $a le b$ and $a + b + sqrt{a^2+b^2+1} le n$. Rearrange and square: $a^2 + b^2 + 1 le n^2 + a^2 + b^2 - 2an +2ab -2bn$ or $b le frac{n^2 - 2an - 1}{2n - 2a}$. I'm not sure why you appear to be using this bound as a forced value in the code you added to the question.





      $$sum_{a=1}^{frac{n}{2 + sqrt 2}} left(frac{n^2 - 2an - 1}{2n - 2a} - aright)
      = left(sum_{a=1}^{frac{n}{2 + sqrt 2}} frac{n^2 - 2an - 1}{2n - 2a}right) - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
      = frac{n^2 + 1}2 left(psileft(frac{n}{2 + sqrt 2}-n+1right) - psi(1-n)right) + frac{n^2}{2 + sqrt 2} - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
      = frac{n^2 + 1}2 left(psileft(1-frac{1+sqrt2}{2 + sqrt 2}nright) - psi(1-n)right) + frac{(3 + 2sqrt 2)n^2 + (2 + sqrt 2)n}{2(2 + sqrt 2)^2} \
      > 0.9n^2
      $$

      The step from the sum to an expression in terms of the digamma function is courtesy of Wolfram Alpha.



      As a rule of thumb, you don't want to be repeating a loop body $10^{12}$ times, so looping over all values of $a$ and all values of $b$ is a non-starter.





      Hint: try rearranging $a^2 + b^2 = c^2 - 1$.






      share|improve this answer
























        2












        2








        2







        v_max_n = n // 3



        The name of this variable is a mystery to me, but you seem to be using it as an upper bound on $a$. You can get a notably better upper bound on $a$ very easily: since $a le b$ and $c^2 = a^2 + b^2 + 1$, $a + b + c ge 2a + sqrt{2a^2 + 1} > 2a + sqrt{2a^2} = (2 + sqrt 2)a$. Then your upper bound on $a$ can be reduced by 12.13%.





        Given a value of $a$, the constraints on $b$ are $a le b$ and $a + b + sqrt{a^2+b^2+1} le n$. Rearrange and square: $a^2 + b^2 + 1 le n^2 + a^2 + b^2 - 2an +2ab -2bn$ or $b le frac{n^2 - 2an - 1}{2n - 2a}$. I'm not sure why you appear to be using this bound as a forced value in the code you added to the question.





        $$sum_{a=1}^{frac{n}{2 + sqrt 2}} left(frac{n^2 - 2an - 1}{2n - 2a} - aright)
        = left(sum_{a=1}^{frac{n}{2 + sqrt 2}} frac{n^2 - 2an - 1}{2n - 2a}right) - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
        = frac{n^2 + 1}2 left(psileft(frac{n}{2 + sqrt 2}-n+1right) - psi(1-n)right) + frac{n^2}{2 + sqrt 2} - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
        = frac{n^2 + 1}2 left(psileft(1-frac{1+sqrt2}{2 + sqrt 2}nright) - psi(1-n)right) + frac{(3 + 2sqrt 2)n^2 + (2 + sqrt 2)n}{2(2 + sqrt 2)^2} \
        > 0.9n^2
        $$

        The step from the sum to an expression in terms of the digamma function is courtesy of Wolfram Alpha.



        As a rule of thumb, you don't want to be repeating a loop body $10^{12}$ times, so looping over all values of $a$ and all values of $b$ is a non-starter.





        Hint: try rearranging $a^2 + b^2 = c^2 - 1$.






        share|improve this answer













        v_max_n = n // 3



        The name of this variable is a mystery to me, but you seem to be using it as an upper bound on $a$. You can get a notably better upper bound on $a$ very easily: since $a le b$ and $c^2 = a^2 + b^2 + 1$, $a + b + c ge 2a + sqrt{2a^2 + 1} > 2a + sqrt{2a^2} = (2 + sqrt 2)a$. Then your upper bound on $a$ can be reduced by 12.13%.





        Given a value of $a$, the constraints on $b$ are $a le b$ and $a + b + sqrt{a^2+b^2+1} le n$. Rearrange and square: $a^2 + b^2 + 1 le n^2 + a^2 + b^2 - 2an +2ab -2bn$ or $b le frac{n^2 - 2an - 1}{2n - 2a}$. I'm not sure why you appear to be using this bound as a forced value in the code you added to the question.





        $$sum_{a=1}^{frac{n}{2 + sqrt 2}} left(frac{n^2 - 2an - 1}{2n - 2a} - aright)
        = left(sum_{a=1}^{frac{n}{2 + sqrt 2}} frac{n^2 - 2an - 1}{2n - 2a}right) - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
        = frac{n^2 + 1}2 left(psileft(frac{n}{2 + sqrt 2}-n+1right) - psi(1-n)right) + frac{n^2}{2 + sqrt 2} - frac{n(n + 2 + sqrt 2)}{2(2 + sqrt 2)^2} \
        = frac{n^2 + 1}2 left(psileft(1-frac{1+sqrt2}{2 + sqrt 2}nright) - psi(1-n)right) + frac{(3 + 2sqrt 2)n^2 + (2 + sqrt 2)n}{2(2 + sqrt 2)^2} \
        > 0.9n^2
        $$

        The step from the sum to an expression in terms of the digamma function is courtesy of Wolfram Alpha.



        As a rule of thumb, you don't want to be repeating a loop body $10^{12}$ times, so looping over all values of $a$ and all values of $b$ is a non-starter.





        Hint: try rearranging $a^2 + b^2 = c^2 - 1$.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Dec 18 at 10:15









        Peter Taylor

        15.7k2658




        15.7k2658























            1














            One improvement over @tobias_k s answer is to add the extra upper bound on b of b<1 + a*a//2. This is valid because after that, the distance between b^2 and (b+1)^2 will be bigger than a^2, so c^2 does not exist. This provides a massive speedup at least a first.






            share|improve this answer





















            • Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance
              – dustin
              Dec 17 at 16:17










            • Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)
              – tobias_k
              Dec 17 at 16:23












            • @dustin How did you add that "extra check"? I added it to the min; did you do something else?
              – tobias_k
              Dec 17 at 16:26










            • @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.
              – dustin
              Dec 17 at 16:35












            • @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.
              – tobias_k
              Dec 17 at 16:39


















            1














            One improvement over @tobias_k s answer is to add the extra upper bound on b of b<1 + a*a//2. This is valid because after that, the distance between b^2 and (b+1)^2 will be bigger than a^2, so c^2 does not exist. This provides a massive speedup at least a first.






            share|improve this answer





















            • Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance
              – dustin
              Dec 17 at 16:17










            • Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)
              – tobias_k
              Dec 17 at 16:23












            • @dustin How did you add that "extra check"? I added it to the min; did you do something else?
              – tobias_k
              Dec 17 at 16:26










            • @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.
              – dustin
              Dec 17 at 16:35












            • @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.
              – tobias_k
              Dec 17 at 16:39
















            1












            1








            1






            One improvement over @tobias_k s answer is to add the extra upper bound on b of b<1 + a*a//2. This is valid because after that, the distance between b^2 and (b+1)^2 will be bigger than a^2, so c^2 does not exist. This provides a massive speedup at least a first.






            share|improve this answer












            One improvement over @tobias_k s answer is to add the extra upper bound on b of b<1 + a*a//2. This is valid because after that, the distance between b^2 and (b+1)^2 will be bigger than a^2, so c^2 does not exist. This provides a massive speedup at least a first.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Dec 17 at 16:08









            Oscar Smith

            2,718922




            2,718922












            • Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance
              – dustin
              Dec 17 at 16:17










            • Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)
              – tobias_k
              Dec 17 at 16:23












            • @dustin How did you add that "extra check"? I added it to the min; did you do something else?
              – tobias_k
              Dec 17 at 16:26










            • @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.
              – dustin
              Dec 17 at 16:35












            • @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.
              – tobias_k
              Dec 17 at 16:39




















            • Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance
              – dustin
              Dec 17 at 16:17










            • Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)
              – tobias_k
              Dec 17 at 16:23












            • @dustin How did you add that "extra check"? I added it to the min; did you do something else?
              – tobias_k
              Dec 17 at 16:26










            • @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.
              – dustin
              Dec 17 at 16:35












            • @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.
              – tobias_k
              Dec 17 at 16:39


















            Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance
            – dustin
            Dec 17 at 16:17




            Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance
            – dustin
            Dec 17 at 16:17












            Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)
            – tobias_k
            Dec 17 at 16:23






            Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)
            – tobias_k
            Dec 17 at 16:23














            @dustin How did you add that "extra check"? I added it to the min; did you do something else?
            – tobias_k
            Dec 17 at 16:26




            @dustin How did you add that "extra check"? I added it to the min; did you do something else?
            – tobias_k
            Dec 17 at 16:26












            @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.
            – dustin
            Dec 17 at 16:35






            @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.
            – dustin
            Dec 17 at 16:35














            @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.
            – tobias_k
            Dec 17 at 16:39






            @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.
            – tobias_k
            Dec 17 at 16:39




















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Code Review Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209821%2fpython-generator-for-barely-obtuse-triangles-project-euler-224%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Morgemoulin

            Scott Moir

            Souastre