How to construct a list of lengths efficiently











up vote
17
down vote

favorite
6












Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...










share|improve this question




















  • 1




    What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    2 days ago















up vote
17
down vote

favorite
6












Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...










share|improve this question




















  • 1




    What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    2 days ago













up vote
17
down vote

favorite
6









up vote
17
down vote

favorite
6






6





Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...










share|improve this question















Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...







list-manipulation performance-tuning filtering






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 2 days ago









Alexey Popkov

38k4104260




38k4104260










asked Nov 13 at 1:00









AccidentalFourierTransform

4,9581941




4,9581941








  • 1




    What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    2 days ago














  • 1




    What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    2 days ago








1




1




What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
2 days ago




What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
2 days ago










4 Answers
4






active

oldest

votes

















up vote
20
down vote



accepted










You can use the usual UnitStep + Total tricks:



r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

r1 === r2



{0.435358, Null}



{41.4357, Null}



True




Update



As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



mincounts[s_] := With[
{
unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
},
With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
]
]


Comparison:



SeedRandom[1];
s=RandomInteger[{1,100000},10000]//Sort;

(* my first answer *)
r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
(* J42161217's answer *)
r2 = Flatten[
Join[
{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
]
][[;;10000]]; // AbsoluteTiming
(* using Nearest *)
r3 = mincounts[s]; //AbsoluteTiming

r1 === r2 === r3



{0.432897, Null}



{0.122198, Null}



{0.025923, Null}



True







share|improve this answer























  • Ah, great answer, as usual.
    – AccidentalFourierTransform
    Nov 13 at 1:19










  • Can you please check my answer because my laptop is very slow
    – J42161217
    Nov 13 at 3:05


















up vote
13
down vote













BinCounts and Accumulate combination is faster than all the methods posted so far:



r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



0.00069




versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
1 ;; Length[s]]]
]; // RepeatedTiming // First



 0.00081




r3 = mincounts[s]; // RepeatedTiming // First



0.016




r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
RepeatedTiming // First



0.149




r2 == r3 == r4 == r5



True







share|improve this answer























  • Beat me to it - BinCounts is the way... +1
    – ciao
    Nov 13 at 6:49






  • 1




    Hey @ciao, you are back?!!
    – kglr
    Nov 13 at 6:53










  • Sorry @Henrik; thanks for the edit.
    – kglr
    2 days ago






  • 1




    Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 days ago










  • @kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
    – ciao
    2 days ago


















up vote
12
down vote













I think this is at least x3 faster than Mr. Carl Woll's answer

Can anybody compare my timing?



r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



{0.157123, Null}




Using MapThread the same code is way faster



r6 = Flatten[
Join[{Table[0, s[[1]] - 1]},
MapThread[
Table, {Range[t = Length[Select[s, # <= 10000 &]]],
Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

r6===r3



{0.008387, Null}



True







share|improve this answer



















  • 1




    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    Nov 13 at 3:55












  • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    Nov 13 at 4:01


















up vote
5
down vote













s = Sort[RandomInteger[{1, 100000}, 10000]];


Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



This can be done with this function:



mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
If[(Head[rules] === Rule) && (rules[[1]] === {}),
rules[[2]],
With[{spopt = SystemOptions["SparseArrayOptions"]},
Internal`WithLocalSettings[
SetSystemOptions[
"SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
SparseArray[rules, dims, background],
SetSystemOptions[spopt]]
]
]


So, in total, r can be obtained as follows:



r4 = Accumulate[
mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
]; // RepeatedTiming // First



0.00055




For comparison:



r3 = Flatten[
Join[{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
RepeatedTiming // First
r3 == r4



0.116



True







share|improve this answer























    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "387"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














     

    draft saved


    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f185874%2fhow-to-construct-a-list-of-lengths-efficiently%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    20
    down vote



    accepted










    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer























    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      Nov 13 at 1:19










    • Can you please check my answer because my laptop is very slow
      – J42161217
      Nov 13 at 3:05















    up vote
    20
    down vote



    accepted










    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer























    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      Nov 13 at 1:19










    • Can you please check my answer because my laptop is very slow
      – J42161217
      Nov 13 at 3:05













    up vote
    20
    down vote



    accepted







    up vote
    20
    down vote



    accepted






    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer














    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Nov 13 at 4:11

























    answered Nov 13 at 1:11









    Carl Woll

    65k285170




    65k285170












    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      Nov 13 at 1:19










    • Can you please check my answer because my laptop is very slow
      – J42161217
      Nov 13 at 3:05


















    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      Nov 13 at 1:19










    • Can you please check my answer because my laptop is very slow
      – J42161217
      Nov 13 at 3:05
















    Ah, great answer, as usual.
    – AccidentalFourierTransform
    Nov 13 at 1:19




    Ah, great answer, as usual.
    – AccidentalFourierTransform
    Nov 13 at 1:19












    Can you please check my answer because my laptop is very slow
    – J42161217
    Nov 13 at 3:05




    Can you please check my answer because my laptop is very slow
    – J42161217
    Nov 13 at 3:05










    up vote
    13
    down vote













    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer























    • Beat me to it - BinCounts is the way... +1
      – ciao
      Nov 13 at 6:49






    • 1




      Hey @ciao, you are back?!!
      – kglr
      Nov 13 at 6:53










    • Sorry @Henrik; thanks for the edit.
      – kglr
      2 days ago






    • 1




      Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 days ago










    • @kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
      – ciao
      2 days ago















    up vote
    13
    down vote













    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer























    • Beat me to it - BinCounts is the way... +1
      – ciao
      Nov 13 at 6:49






    • 1




      Hey @ciao, you are back?!!
      – kglr
      Nov 13 at 6:53










    • Sorry @Henrik; thanks for the edit.
      – kglr
      2 days ago






    • 1




      Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 days ago










    • @kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
      – ciao
      2 days ago













    up vote
    13
    down vote










    up vote
    13
    down vote









    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer














    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 2 days ago









    Henrik Schumacher

    44.9k265130




    44.9k265130










    answered Nov 13 at 6:31









    kglr

    171k8193399




    171k8193399












    • Beat me to it - BinCounts is the way... +1
      – ciao
      Nov 13 at 6:49






    • 1




      Hey @ciao, you are back?!!
      – kglr
      Nov 13 at 6:53










    • Sorry @Henrik; thanks for the edit.
      – kglr
      2 days ago






    • 1




      Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 days ago










    • @kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
      – ciao
      2 days ago


















    • Beat me to it - BinCounts is the way... +1
      – ciao
      Nov 13 at 6:49






    • 1




      Hey @ciao, you are back?!!
      – kglr
      Nov 13 at 6:53










    • Sorry @Henrik; thanks for the edit.
      – kglr
      2 days ago






    • 1




      Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 days ago










    • @kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
      – ciao
      2 days ago
















    Beat me to it - BinCounts is the way... +1
    – ciao
    Nov 13 at 6:49




    Beat me to it - BinCounts is the way... +1
    – ciao
    Nov 13 at 6:49




    1




    1




    Hey @ciao, you are back?!!
    – kglr
    Nov 13 at 6:53




    Hey @ciao, you are back?!!
    – kglr
    Nov 13 at 6:53












    Sorry @Henrik; thanks for the edit.
    – kglr
    2 days ago




    Sorry @Henrik; thanks for the edit.
    – kglr
    2 days ago




    1




    1




    Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 days ago




    Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 days ago












    @kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
    – ciao
    2 days ago




    @kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
    – ciao
    2 days ago










    up vote
    12
    down vote













    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer



















    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      Nov 13 at 3:55












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      Nov 13 at 4:01















    up vote
    12
    down vote













    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer



















    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      Nov 13 at 3:55












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      Nov 13 at 4:01













    up vote
    12
    down vote










    up vote
    12
    down vote









    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer














    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 2 days ago

























    answered Nov 13 at 3:04









    J42161217

    2,893219




    2,893219








    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      Nov 13 at 3:55












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      Nov 13 at 4:01














    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      Nov 13 at 3:55












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      Nov 13 at 4:01








    1




    1




    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    Nov 13 at 3:55






    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    Nov 13 at 3:55














    Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    Nov 13 at 4:01




    Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    Nov 13 at 4:01










    up vote
    5
    down vote













    s = Sort[RandomInteger[{1, 100000}, 10000]];


    Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



    This can be done with this function:



    mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
    If[(Head[rules] === Rule) && (rules[[1]] === {}),
    rules[[2]],
    With[{spopt = SystemOptions["SparseArrayOptions"]},
    Internal`WithLocalSettings[
    SetSystemOptions[
    "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
    SparseArray[rules, dims, background],
    SetSystemOptions[spopt]]
    ]
    ]


    So, in total, r can be obtained as follows:



    r4 = Accumulate[
    mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
    ]; // RepeatedTiming // First



    0.00055




    For comparison:



    r3 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First
    r3 == r4



    0.116



    True







    share|improve this answer



























      up vote
      5
      down vote













      s = Sort[RandomInteger[{1, 100000}, 10000]];


      Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



      This can be done with this function:



      mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
      If[(Head[rules] === Rule) && (rules[[1]] === {}),
      rules[[2]],
      With[{spopt = SystemOptions["SparseArrayOptions"]},
      Internal`WithLocalSettings[
      SetSystemOptions[
      "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
      SparseArray[rules, dims, background],
      SetSystemOptions[spopt]]
      ]
      ]


      So, in total, r can be obtained as follows:



      r4 = Accumulate[
      mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
      ]; // RepeatedTiming // First



      0.00055




      For comparison:



      r3 = Flatten[
      Join[{Table[0, s[[1]] - 1]},
      Table[Table[i, Differences[s][[i]]], {i,
      Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
      RepeatedTiming // First
      r3 == r4



      0.116



      True







      share|improve this answer

























        up vote
        5
        down vote










        up vote
        5
        down vote









        s = Sort[RandomInteger[{1, 100000}, 10000]];


        Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



        This can be done with this function:



        mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
        If[(Head[rules] === Rule) && (rules[[1]] === {}),
        rules[[2]],
        With[{spopt = SystemOptions["SparseArrayOptions"]},
        Internal`WithLocalSettings[
        SetSystemOptions[
        "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
        SparseArray[rules, dims, background],
        SetSystemOptions[spopt]]
        ]
        ]


        So, in total, r can be obtained as follows:



        r4 = Accumulate[
        mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
        ]; // RepeatedTiming // First



        0.00055




        For comparison:



        r3 = Flatten[
        Join[{Table[0, s[[1]] - 1]},
        Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
        RepeatedTiming // First
        r3 == r4



        0.116



        True







        share|improve this answer














        s = Sort[RandomInteger[{1, 100000}, 10000]];


        Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



        This can be done with this function:



        mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
        If[(Head[rules] === Rule) && (rules[[1]] === {}),
        rules[[2]],
        With[{spopt = SystemOptions["SparseArrayOptions"]},
        Internal`WithLocalSettings[
        SetSystemOptions[
        "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
        SparseArray[rules, dims, background],
        SetSystemOptions[spopt]]
        ]
        ]


        So, in total, r can be obtained as follows:



        r4 = Accumulate[
        mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
        ]; // RepeatedTiming // First



        0.00055




        For comparison:



        r3 = Flatten[
        Join[{Table[0, s[[1]] - 1]},
        Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
        RepeatedTiming // First
        r3 == r4



        0.116



        True








        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 2 days ago

























        answered Nov 13 at 6:32









        Henrik Schumacher

        44.9k265130




        44.9k265130






























             

            draft saved


            draft discarded



















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f185874%2fhow-to-construct-a-list-of-lengths-efficiently%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Morgemoulin

            Scott Moir

            Souastre