Get stacked game state in NHWC format























After reading this, I decided to move my DQN code from the keras library to tf.keras (the code is located in this repo). My original code used the NCHW format, as it was faster on GPUs. Since I also need to run on CPUs, I found that it only runs with the NHWC format on my TensorFlow version (1.12 with Python 3.7, compiled with AVX2 flags).



I redesigned my code and it works, but I stack the frames in NCHW format and then have to transpose them on every function call, which is costly.



My code



There is a 'frames' list of length 'self.nb_frames' that holds the (10, 10) states. I then apply expand_dims and transpose, returning the stacked array in NHWC format.



example.py



    #!/usr/bin/env python

    import numpy as np


    class Agent:

        def __init__(self):
            """Initialize the agent with given attributes."""
            self.frames = None
            self.nb_frames = 4

        def get_game_data(self, state):
            """Keep the last 4 frames (append/pop each call) and return them as NHWC."""
            frame = state

            if self.frames is None:
                self.frames = [frame] * self.nb_frames
            else:
                self.frames.append(frame)
                self.frames.pop(0)

            # From (4, 10, 10) to (1, 4, 10, 10)
            expanded_frames = np.expand_dims(self.frames, 0)

            # From (1, 4, 10, 10) to (1, 10, 10, 4) | NCHW -> NHWC
            # Axes (0, 2, 3, 1) keep H and W in order; (0, 3, 2, 1) would swap them.
            expanded_frames = np.transpose(expanded_frames, [0, 2, 3, 1])

            return expanded_frames


    board_size = 10
    state = np.zeros((board_size, board_size))
    agent = Agent()
    stacked_state = agent.get_game_data(state)
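As a sanity check on the axis order, here is a small sketch of the shapes involved. It uses a hypothetical non-square board (2 x 3) instead of my (10, 10) states, only so that a swapped height and width would be visible in the output shape.

```python
import numpy as np

# Hypothetical non-square frames (H=2, W=3) so that swapped axes are visible.
nb_frames, height, width = 4, 2, 3
frames = [np.full((height, width), i, dtype=np.float32) for i in range(nb_frames)]

nchw = np.expand_dims(frames, 0)           # (1, 4, 2, 3)
nhwc = np.transpose(nchw, (0, 2, 3, 1))    # (1, 2, 3, 4): channels moved last
nwhc = np.transpose(nchw, (0, 3, 2, 1))    # (1, 3, 2, 4): H and W also swapped

print(nchw.shape, nhwc.shape, nwhc.shape)
```

On a square board both permutations give the same shape, so the difference only shows up in the contents of each frame.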


To measure how costly the transpose is, I ran the code below under two conditions:




  1. Without transposing (NCHW) = 5.775287926 s;


  2. Transposing (NHWC) = 7.381751397 s.



    from timeit import Timer

    t = Timer(lambda: agent.get_game_data(state))
    print(t.timeit(number=1000000))



So transposing adds about 28% to the running time of the function get_game_data compared with the NCHW version (about 22% of the total time of the transposing version).
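The overhead can be read in two ways, relative to the NCHW baseline or as a share of the NHWC total. The arithmetic on the two timings above is spelled out here; the variable names are only for illustration.

```python
# Timings reported above, in seconds per 1,000,000 calls.
t_nchw = 5.775287926   # without transposing
t_nhwc = 7.381751397   # with transposing

extra = t_nhwc - t_nchw
overhead_vs_baseline = extra / t_nchw   # extra time relative to the NCHW version
share_of_total = extra / t_nhwc         # fraction of the NHWC run that is extra

print(f"extra: {extra:.3f} s")
print(f"overhead vs. NCHW: {overhead_vs_baseline:.1%}")
print(f"share of NHWC total: {share_of_total:.1%}")
```

This gives roughly 1.6 s of extra time, which is about 27.8% over the baseline and about 21.8% of the NHWC total.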



Is there a better option to create the list directly in the (10, 10, 4) format? Does my code follow best practices?
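To make the question concrete, this is the kind of thing I have in mind: a minimal sketch that stacks the frames along the last axis so no transpose is needed. The class name StackedFrames is hypothetical, and I have not timed it, so whether it is actually faster than the transpose version is an open assumption.

```python
from collections import deque

import numpy as np


class StackedFrames:
    """Hypothetical helper: keeps the last nb_frames states, stacked channels-last."""

    def __init__(self, nb_frames=4):
        self.nb_frames = nb_frames
        self.frames = None

    def get_game_data(self, state):
        if self.frames is None:
            # First call: fill the window by repeating the first frame.
            self.frames = deque([state] * self.nb_frames, maxlen=self.nb_frames)
        else:
            # deque(maxlen=...) drops the oldest frame automatically.
            self.frames.append(state)

        # Stack along a new last axis: 4 x (H, W) -> (H, W, 4), then add N in front.
        return np.stack(list(self.frames), axis=-1)[np.newaxis, ...]


stacker = StackedFrames()
first = stacker.get_game_data(np.zeros((10, 10)))
second = stacker.get_game_data(np.ones((10, 10)))
print(first.shape, second[0, 0, 0, :])
```

The newest frame ends up in the last channel, matching the append order of the list-based version.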










      python algorithm object-oriented numpy machine-learning














      edited Nov 22 at 22:35

























      asked Nov 22 at 17:39









      Neves4



























