Training loss and accuracy don't change (stuck around a value) — TensorFlow

My LSTM model below in TensorFlow runs without any error, but the loss and accuracy hover around fixed values, which makes the results after all the epochs misleading.



My thoughts were/are:




  • It might be that I am feeding the same batch every time. I don't think this is the problem, because the printout after every 10th epoch shows different values, so different batches of the data seem to be fed.


  • It might be caused by the low complexity of the LSTM network. However, even then the training metrics should improve somewhat over time thanks to the optimization.


  • Since the loss and accuracy stay around fixed values, I might be calculating or printing them incorrectly, although I cannot spot the error in my code.


  • I suspect this issue is due to some typo or misunderstanding in my code; otherwise I would expect the loss and accuracy to increase or decrease, perhaps irregularly, but not stay flat like this.


  • Finally, if the code itself is correct, these values could be due to bad hyperparameter settings. My next steps would be to increase the complexity of the LSTM network, lower the learning rate, and increase the batch size; a small sketch of the shuffling and learning-rate ideas follows this list.
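For reference, this is roughly what I mean by shuffling the batches (first bullet) and lowering the learning rate (last bullet). It is only a sketch that reuses the `x`, `y` and `BATCH_SIZE` names from the code below; `SHUFFLE_BUFFER` is a made-up constant:

import tensorflow as tf

SHUFFLE_BUFFER = 70000  # hypothetical; ideally at least the number of examples

# Reshuffle every epoch so each repetition yields different batches.
train_dataset = (tf.data.Dataset.from_tensor_slices((x, y))
                 .shuffle(SHUFFLE_BUFFER, reshuffle_each_iteration=True)
                 .batch(BATCH_SIZE)
                 .repeat())

# 0.1 is a very high learning rate for Adam; its default of 1e-3 is a more
# common starting point for LSTM models.
optimizer_type = tf.train.AdamOptimizer(learning_rate=1e-3, name='AdamOptimizer')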



As there is not much documentation (and given my limited experience) on combining the Dataset API with RNNs, different batches and different datasets, I may have done something wrong.
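In case it helps the review, this is the pattern I believe the reinitializable iterator is meant to follow when switching to the validation set. It is just a sketch reusing the `validation_init_op`, `cost`, `accuracy` and placeholder names from the code below; the non-repeating validation dataset is drained until it is exhausted:

# Sketch: switch the iterator to the validation data, then consume it
# batch by batch until tf.errors.OutOfRangeError signals the end.
sess.run(validation_init_op, feed_dict={x: xval, y: yval, batch_size: BATCH_SIZE})
val_losses, val_accs = [], []
while True:
    try:
        # cost and accuracy read their inputs from the iterator,
        # so no x/y feed is needed here.
        l, a = sess.run([cost, accuracy])
        val_losses.append(l)
        val_accs.append(a)
    except tf.errors.OutOfRangeError:
        break
print('Validation loss:', sum(val_losses) / len(val_losses))
print('Validation accuracy:', sum(val_accs) / len(val_accs))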



I would appreciate any review or knowledge that could shed some light on this.



Many thanks for your time.



CODE:



import os
from datetime import datetime

import tensorflow as tf

'''All the data is numeric. The dataset is scaled with scikit-learn's StandardScaler and has the following shapes:'''

#The 3D xt array has a shape of: (11, 69579, 74)
#The 3D xval array has a shape of: (11, 7732, 74)

#yt shape is: (69579, 3)
#yval shape is: (7732, 3)

N_TIMESTEPS_X = xt.shape[0] ## Number of time steps (the leading axis of xt)
BATCH_SIZE = 256
#N_OBSERVATIONS = xt.shape[1]
N_FEATURES = xt.shape[2]
N_OUTPUTS = yt.shape[1]
N_NEURONS_LSTM = 128 ## Number of units in the LSTMCell
N_EPOCHS = 600
LEARNING_RATE = 0.1

### Define the placeholders and gather the data.
xt = xt.transpose([1,0,2])
xval = xval.transpose([1,0,2])

train_data = (xt, yt)
validation_data = (xval, yval)

## We define placeholders so that we do not run into memory problems associated with feeding the full arrays directly.

'''As an alternative, you can define the Dataset in terms of tf.placeholder() tensors, and feed the NumPy arrays when you initialize an Iterator over the dataset.'''

batch_size = tf.placeholder(tf.int64, shape=[], name='BatchSizePlaceholder')  ## Scalar, so training and validation can use different batch sizes.

x = tf.placeholder(tf.float32, shape=[None, N_TIMESTEPS_X, N_FEATURES], name='XPlaceholder')

y = tf.placeholder(tf.float32, shape=[None, N_OUTPUTS], name='YPlaceholder')

# Create the two dataset objects; the batch size comes from the placeholder fed at iterator initialisation.

train_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size).repeat()

val_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size)

# Create a reinitializable iterator that allows switching between datasets.

itr = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_init_op = itr.make_initializer(train_dataset)
validation_init_op = itr.make_initializer(val_dataset)

next_features, next_labels = itr.get_next()

### Create the graph

cellType = tf.nn.rnn_cell.LSTMCell(num_units=N_NEURONS_LSTM, name='LSTMCell')

inputs = tf.unstack(next_features, axis=1)

'''inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size]'''

RNNOutputs, _ = tf.nn.static_rnn(cell=cellType, inputs=inputs, dtype=tf.float32)

out_weights = tf.get_variable("out_weights", shape=[N_NEURONS_LSTM, N_OUTPUTS], dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer())
out_bias = tf.get_variable("out_bias", shape=[N_OUTPUTS], dtype=tf.float32, initializer=tf.zeros_initializer())

predictionsLayer = tf.matmul(RNNOutputs[-1], out_weights) + out_bias

### Define the cost function that the optimizer will minimize.

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=predictionsLayer, labels=next_labels, name='Softmax_plus_Cross_Entropy'))

optimizer_type = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE, name='AdamOptimizer')

optimizer = optimizer_type.minimize(cost)

### Model evaluation

correctPrediction = tf.equal(tf.argmax(predictionsLayer,1), tf.argmax(next_labels,1))

accuracy = tf.reduce_mean(tf.cast(correctPrediction,tf.float32))

N_BATCHES = train_data[0].shape[0] // BATCH_SIZE

## Saving variables so that we can restore them afterwards.

saver = tf.train.Saver()

save_dir = '/home/zmlaptop/Desktop/tfModels/{}_{}'.format(cellType.__class__.__name__, datetime.now().strftime("%Y%m%d%H%M%S"))
os.mkdir(save_dir)

varDict = {'nTimeSteps': N_TIMESTEPS_X, 'BatchSize': BATCH_SIZE, 'nFeatures': N_FEATURES,
           'nNeuronsLSTM': N_NEURONS_LSTM, 'nEpochs': N_EPOCHS,
           'learningRate': LEARNING_RATE, 'optimizerType': optimizer_type.__class__.__name__}

varDicSavingTxt = save_dir + '/varDict.txt'
modelFilesDir = save_dir + '/modelFiles'
os.mkdir(modelFilesDir)

logDir = save_dir + '/TBoardLogs'
os.mkdir(logDir)

acc_summary = tf.summary.scalar('Accuracy', accuracy)
loss_summary = tf.summary.scalar('Cost_CrossEntropy', cost)
summary_merged = tf.summary.merge_all()

with open(varDicSavingTxt, 'w') as outfile:
    outfile.write(repr(varDict))

with tf.Session() as sess:

    tf.set_random_seed(2)
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(logDir + '/train', sess.graph)
    validation_writer = tf.summary.FileWriter(logDir + '/validation')

    # Initialise the iterator with the training data.
    sess.run(train_init_op, feed_dict={x: train_data[0], y: train_data[1], batch_size: BATCH_SIZE})

    print('¡Training starts!')
    for epoch in range(N_EPOCHS):

        batchAccList = []
        tot_loss = 0

        for batch in range(N_BATCHES):

            optimizer_output, loss_value, summary, accBatch = sess.run(
                [optimizer, cost, summary_merged, accuracy],
                feed_dict={x: train_data[0], y: train_data[1], batch_size: BATCH_SIZE})
            tot_loss += loss_value
            batchAccList.append(accBatch)

            if batch % 10 == 0:
                train_writer.add_summary(summary, batch)

        epochAcc = tf.reduce_mean(batchAccList)
        epochAcc_num = sess.run(epochAcc, feed_dict={x: train_data[0], y: train_data[1], batch_size: BATCH_SIZE})

        if epoch % 10 == 0:
            print("Epoch: {}, Loss: {:.4f}, Accuracy: {}".format(epoch, tot_loss / N_BATCHES, epochAcc_num))

    # Initialise the iterator with the validation data.
    sess.run(validation_init_op, feed_dict={x: validation_data[0], y: validation_data[1], batch_size: len(validation_data[0])})

    valLoss, valAcc = sess.run([cost, accuracy], feed_dict={x: train_data[0], y: train_data[1], batch_size: BATCH_SIZE})
    print('Validation Loss: {:4f}, Validation Accuracy: {}'.format(valLoss, valAcc))

    summary_val = sess.run(summary_merged, feed_dict={x: validation_data[0], y: validation_data[1], batch_size: len(validation_data[0])})
    validation_writer.add_summary(summary_val)

    saver.save(sess, modelFilesDir)
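One part I am unsure about is computing the epoch accuracy with tf.reduce_mean inside the loop, since as far as I understand this adds a new op to the graph every epoch. If that is right, a plain NumPy mean over the Python list would avoid it (a sketch, assuming numpy is imported as np):

import numpy as np

# Inside the epoch loop: average the per-batch accuracies in Python
# instead of building a new tf.reduce_mean op every epoch.
epochAcc_num = float(np.mean(batchAccList))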


This is the TensorBoard output for accuracy and loss (training stopped at around 250 epochs):



[TensorBoard screenshot: accuracy and loss curves staying roughly flat]
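I also wonder whether part of the flat-looking curves comes from passing `batch` as the step to add_summary, since it resets every epoch and points overwrite each other. A monotonically increasing step would look like this (a sketch, reusing `epoch`, `batch` and `N_BATCHES` from the loop above):

# A global step that never resets, so TensorBoard plots each point once.
global_step = epoch * N_BATCHES + batch
train_writer.add_summary(summary, global_step)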










Tags: python-3.x, tensorflow





