Get stacked game state in NHWC format
After reading this, I decided to transition my DQN code from the keras library to tf.keras (the code is located in this repo). My original code used the NCHW format, as it is faster on GPUs, but since I also need to run on CPUs, I found that it only runs with the NHWC format on my TF version (1.12 with Python 3.7, compiled with AVX2 flags).
I redesigned my code and it works, but I still stack the frames in NCHW format and then have to transpose them on every function call, which is costly.
My code
There is a frames list of size self.nb_frames that holds the (10, 10) states. I then call expand_dims and transpose, returning the stack in NHWC format.
example.py
#!/usr/bin/env python
import numpy as np


class Agent:
    def __init__(self):
        """Initialize the agent with given attributes."""
        self.frames = None
        self.nb_frames = 4

    def get_game_data(self, state):
        """Keep a list of 4 frames, appending/popping one each call."""
        frame = state
        if self.frames is None:
            self.frames = [frame] * self.nb_frames
        else:
            self.frames.append(frame)
            self.frames.pop(0)
        # From (4, 10, 10) to (1, 4, 10, 10)
        expanded_frames = np.expand_dims(self.frames, 0)
        # From (1, 4, 10, 10) to (1, 10, 10, 4) | NCHW -> NHWC
        expanded_frames = np.transpose(expanded_frames, [0, 3, 2, 1])
        return expanded_frames


board_size = 10
state = np.zeros((board_size, board_size))
agent = Agent()
stacked_state = agent.get_game_data(state)
To verify how costly the transpose is, I executed the code below under two conditions:
- Without transposing (NCHW): 5.775287926 s;
- Transposing (NHWC): 7.381751397 s.
from timeit import Timer

t = Timer(lambda: agent.get_game_data(state))
print(t.timeit(number=1000000))
So adding the transpose increases the running time of get_game_data by roughly 27%.
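For completeness, the two pieces of the function could also be timed in isolation with the same Timer pattern. The snippet below is only a hypothetical measurement harness (the frames list is built by hand to mimic the internal state of get_game_data), but it separates the cost of the list-to-array conversion from the cost of the permutation itself:

from timeit import Timer

import numpy as np

# Mimic the internal state of get_game_data: four (10, 10) frames.
frames = [np.zeros((10, 10))] * 4
expanded = np.expand_dims(frames, 0)  # (1, 4, 10, 10)

t_convert = Timer(lambda: np.expand_dims(frames, 0))
t_transpose = Timer(lambda: np.transpose(expanded, [0, 3, 2, 1]))

print(t_convert.timeit(number=1000000))
print(t_transpose.timeit(number=1000000))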
Is there a better way to build the stack directly in the (10, 10, 4) format? Does my code follow best practices?
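For reference, here is a rough sketch of the direction I was considering, assuming the same (10, 10) board and 4-frame stack as above: keep an array that is already laid out as (1, H, W, C) and shift the frames along the last axis, so neither the list-to-array conversion nor the transpose is needed on each call. The class name NHWCAgent is made up for this illustration, and I have not profiled it against the original:

#!/usr/bin/env python
import numpy as np


class NHWCAgent:
    """Keep the frame stack in NHWC layout the whole time."""

    def __init__(self, nb_frames=4):
        self.nb_frames = nb_frames
        # Will hold a (1, board_size, board_size, nb_frames) array.
        self.frames = None

    def get_game_data(self, state):
        if self.frames is None:
            # Repeat the first frame across all channels, like the original code.
            self.frames = np.repeat(state[np.newaxis, :, :, np.newaxis],
                                    self.nb_frames, axis=-1)
        else:
            # Drop the oldest channel and write the newest frame into the last one.
            self.frames = np.roll(self.frames, shift=-1, axis=-1)
            self.frames[0, :, :, -1] = state
        return self.frames


board_size = 10
state = np.zeros((board_size, board_size))
agent = NHWCAgent()
stacked_state = agent.get_game_data(state)
print(stacked_state.shape)  # (1, 10, 10, 4)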
python algorithm object-oriented numpy machine-learning
asked Nov 22 at 17:39 by Neves4, edited Nov 22 at 22:35