Ability to save a trained model #932

Closed
Yandychang1 opened this issue Apr 16, 2022 · 29 comments
Labels
missing feature/s An issue about missing features in the library.

Comments

@Yandychang1

Is there an example of how to save a trained model? I have spent a significant amount of time trying to figure it out, without success.

@bigbugcc
Contributor

keras:
var model = Training();
model.save_weights("your_save_path_file.h5");

Is this what you want?

@Yandychang1
Author

Yandychang1 commented Apr 25, 2022

I think doing it like that will give me a FileNotFound exception in HDF5CSharp.dll.
Besides, if I save only the weights, I would need to redefine the network in order to load them, and that option may not always be available.
My goal was to save the files the way TensorFlow 2 would save them, so that I could then convert the model to an ONNX file.
The inability to do this easily is what keeps me from using this library instead of CNTK.

@Yandychang1
Author

Yandychang1 commented Apr 25, 2022

.

@bigbugcc
Contributor

I get it. I can't solve this problem.
The author doesn't seem to provide relevant examples.

@jletria

jletria commented May 24, 2022

The save method of Tensorflow.Keras.Engine.Model isn't working.
It's supposed to save Keras models to HDF5 files (the complete model, not just the weights).

It calls ModelSaver, which seems to be only half implemented and ends in an empty _build_meta_graph function.

Is there an alternative working method to save a Keras model to HDF5?
The save_weights method only saves weights, not the whole model.

@bigbugcc
Contributor

I saw an example that uses another approach: defining a Saver object to save the model. However, this method doesn't work with the Keras API.
var saver = tf.train.Saver(tf.global_variables());
saver.save(sess, ModelPath);

@Yandychang1
Author

Thank you for your answer.
From what I could gather, this saves a TensorFlow session. The problem I ran into with this approach was converting a Keras model into a session that I can serialize.

@Oceania2018 added the "missing feature/s" label on Nov 26, 2022

@Firestorm-253

@Oceania2018 So there is no way to load anything into an untrained model?
What is the use of having a massive and complex ML library if you can't save or load anything you trained for hours?
If that's actually the case, the whole project could be considered useless.

I like the work you did here, but that's quite depressing.

Greetings, Fire

@Oceania2018
Member

There is one approach: you can load a saved model in the pb format into TensorFlow.NET, for prediction purposes only.
Check this example. It imports a trained model.

@pantokrr

pantokrr commented Jan 31, 2023

> @Oceania2018 What is the use of having a massive and complex ML library if you can't save or load anything you trained for hours? If that's actually the case, the whole project could be considered useless.

I absolutely agree. It is pointless to continue using this library if you cannot save a trained model.

And yes, it is possible to load a saved model, but that's not the focus of this issue.

@AsakusaRinne
Collaborator

We agree that the ability to save a model is important, and I'm working on it now. I'll let you know once it's finished. :)

@AsakusaRinne
Collaborator

Model saving in the pb format has been partially finished and merged (#976). It supports saving a trained Keras model, and the saved model can be loaded directly with TensorFlow Python. For usage, please refer to the AlexNet saving example, which is approximately the same as the TensorFlow Python API.

You are welcome to try this feature and share any bugs or missing features with us! I'll continue to complete this feature and add model loading for the pb format.

For details of this feature, please refer to #976 (comment). The main incomplete parts are checkpoints and RNNs.

@AsakusaRinne
Collaborator

Loading models in the SavedModel format with Keras is also supported now. Here is an example that loads AlexNet. Support for loading more complex models such as BERT is under development by @Oceania2018.
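
As a rough illustration only (not the linked AlexNet example), loading might look like the sketch below. It assumes a TensorFlow.NET build that includes the loading support described above, that keras.models.load_model mirrors the Python API, and that "test_model" is a SavedModel directory written earlier by model.save. The directory name is purely illustrative.

```csharp
using static Tensorflow.KerasApi;

// Assumption: "test_model" is a SavedModel directory produced earlier
// by model.save("test_model"); the name is only an example.
var model = keras.models.load_model("test_model");

// Print the restored architecture to confirm that the layers came back.
model.summary();
```

If the call is missing or its signature differs in your installed version, check that version's own examples before relying on this sketch.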

@individ2016

individ2016 commented Mar 26, 2023

I still can't save a model when using SciSharp.TensorFlow.Redist-Windows-GPU. I get a NotImplementedException in the Tensorflow.Checkpoint.MultiDeviceSaver.save method.

TensorFlow.NET\src\TensorFlowNET.Core\Checkpoint\functional_saver.cs, ln 383

@AsakusaRinne
Collaborator

> I still can't save a model when using SciSharp.TensorFlow.Redist-Windows-GPU. I get a NotImplementedException in the Tensorflow.Checkpoint.MultiDeviceSaver.save method.
>
> TensorFlow.NET\src\TensorFlowNET.Core\Checkpoint\functional_saver.cs, ln 383

Hi, could you please provide a minimal example that reproduces this exception? Model saving is not complete and is still in rapid development. I'd like to work on your problem first :)

@Bender209

When will saving be fixed? What is the point of this project if you can't save a model?

@individ2016

individ2016 commented Apr 2, 2023

> > I still can't save a model when using SciSharp.TensorFlow.Redist-Windows-GPU. I get a NotImplementedException in the Tensorflow.Checkpoint.MultiDeviceSaver.save method.
> > TensorFlow.NET\src\TensorFlowNET.Core\Checkpoint\functional_saver.cs, ln 383
>
> Hi, could you please provide a minimal example that reproduces this exception? Model saving is not complete and is still in rapid development. I'd like to work on your problem first :)

Sorry for the late response. The code is here:

var layers = keras.layers;
// input layer
var inputs = keras.Input(shape: (28, 28, 1), name: "img");
var x = layers.Conv2D(32, (3, 3), padding: "same", activation: "relu").Apply(inputs);
x = layers.MaxPooling2D((2, 2), strides: (2, 2)).Apply(x);
x = layers.Conv2D(64, (3, 3), padding: "same", activation: "relu").Apply(x);
x = layers.MaxPooling2D((2, 2), strides: (2, 2)).Apply(x);
x = layers.Flatten().Apply(x);
x = layers.Dense(128, activation: "relu").Apply(x);
var outputs = layers.Dense(10, activation: "softmax").Apply(x);
var model = keras.Model(inputs, outputs, name: "conv_net");
model.summary();
model.compile(optimizer: keras.optimizers.Adam(),
    loss: keras.losses.CategoricalCrossentropy(),
    metrics: new[] { "accuracy" });

// prepare dataset
var ((x_train, y_train), (x_test, y_test)) = keras.datasets.mnist.load_data();

// normalize the input
x_train = x_train / 255f;
var y_train_cat = np_utils.to_categorical(y_train, 10);
x_train = np.expand_dims(x_train, axis: 3);
model.fit(x_train, y_train_cat, batch_size: 32, epochs: 1, validation_split: 0.2f);
model.save("test_model");

@Oceania2018 added this to the TensorFlow.NET v1.0 milestone on Apr 5, 2023

@AsakusaRinne
Collaborator

@individ2016 Could you please tell me which version you used? I tried the code above and it works under both v0.100.2 and v0.100.4.

@Bender209

Have you tried it with a GPU? Notice the tf.device("CPU") line in this code:
public Operation save(Tensor file_prefix, CheckpointOptions? options = null)
{
    if (options is null)
    {
        options = new CheckpointOptions();
    }

    tf.device("CPU"); // may be risky.
    var sharded_suffix = array_ops.where(gen_ops.regex_full_match(file_prefix, tf.constant(@"^s3://.*")),
        constant_op.constant(".part"), constant_op.constant("_temp/part"));
    var tmp_checkpoint_prefix = gen_ops.string_join(new Tensor[] { file_prefix, sharded_suffix });
    IDictionary<string, Tensor> registered_paths = _registered_savers.Keys.ToDictionary(x => x, x => registered_saver_filename(file_prefix, x));

@AsakusaRinne
Collaborator

@Bender209 Yes, the code that specifies the device is risky and will be fixed later. What problem did you run into when saving the model? I'll help fix it.

@individ2016

individ2016 commented Apr 6, 2023

@AsakusaRinne,

TensorFlow.NET v0.100.4
TensorFlow.Keras v0.10.4
SciSharp.TensorFlow.Redist-Windows-GPU v2.10.0

Yep, it's on GPU. When I'm using the CPU, it's all OK. That's why, for now, I train the model, save the weights, then switch to the CPU, create the model, load the weights, and then I can save the full model ))
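
The workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions: the weights file name conv_net_weights.h5 is hypothetical, the network is the one from the repro code earlier in this thread, and the switch to the CPU is assumed to happen outside the code (for example, by running the second step in a process that uses the CPU redist package).

```csharp
using static Tensorflow.KerasApi;

// Step 1 (GPU process, right after model.fit): save the weights only.
//     model.save_weights("conv_net_weights.h5");   // hypothetical file name

// Step 2 (CPU process): rebuild exactly the same architecture.
var layers = keras.layers;
var inputs = keras.Input(shape: (28, 28, 1), name: "img");
var x = layers.Conv2D(32, (3, 3), padding: "same", activation: "relu").Apply(inputs);
x = layers.MaxPooling2D((2, 2), strides: (2, 2)).Apply(x);
x = layers.Conv2D(64, (3, 3), padding: "same", activation: "relu").Apply(x);
x = layers.MaxPooling2D((2, 2), strides: (2, 2)).Apply(x);
x = layers.Flatten().Apply(x);
x = layers.Dense(128, activation: "relu").Apply(x);
var outputs = layers.Dense(10, activation: "softmax").Apply(x);
var model = keras.Model(inputs, outputs, name: "conv_net");
model.compile(optimizer: keras.optimizers.Adam(),
    loss: keras.losses.CategoricalCrossentropy(),
    metrics: new[] { "accuracy" });

// Restore the weights trained on the GPU into the freshly built model.
model.load_weights("conv_net_weights.h5");

// Saving the full model is reported above to work when running on the CPU.
model.save("test_model");
```

The layer definitions in step 2 must match the ones used for training, because the weights file does not describe the architecture.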

@Bender209

It appears we are both trying to save a model on a GPU, and we are both getting the same exception.
Based on the stack trace:
at Tensorflow.Checkpoint.MultiDeviceSaver.save(Tensor file_prefix, CheckpointOptions options)
at Tensorflow.Checkpoint.TrackableSaver.<>c__DisplayClass14_0.<save_cached_when_graph_building>b__0()
at Tensorflow.Checkpoint.TrackableSaver.save_cached_when_graph_building(String file_prefix, Tensor object_graph_tensor, CheckpointOptions options)
at Tensorflow.Checkpoint.TrackableSaver.save(String file_prefix, Nullable`1 checkpoint_number, Session session, CheckpointOptions options)
at Tensorflow.SavedModelUtils.save_and_return_nodes(Trackable obj, String export_dir, ConcreteFunction signatures, SaveOptions options, Boolean experimental_skip_checkpoint)
at Tensorflow.Keras.Saving.SavedModel.KerasSavedModelUtils.save_model(Model model, String filepath, Boolean overwrite, Boolean include_optimizer, ConcreteFunction signatures, SaveOptions options, Boolean save_traces)
at Tensorflow.Keras.Engine.Model.save(String filepath, Boolean overwrite, Boolean include_optimizer, String save_format, SaveOptions options, ConcreteFunction signatures, Boolean save_traces)

I think the cause is that the save is pinned to the CPU device by this line: tf.device("CPU"); // may be risky.

Is this the case? To me it seems likely.

@Bender209

@AsakusaRinne

@AsakusaRinne
Collaborator

> @AsakusaRinne

I've reproduced the error and am working on it. The fix involves more than just changing the behavior of tf.device(), so it may take some time. I expect to complete it tomorrow or the day after. 😊

@Bender209

Thank you, I am looking forward to it.

@AsakusaRinne
Collaborator

> Thank you, I am looking forward to it.

It went faster than I expected. I've submitted a PR, #1023, that resolves the error when saving a model on a GPU. It will be merged after review by @Oceania2018. You are also welcome to fetch the branch and check whether this PR resolves the problem.

Model saving and loading is a big feature of TensorFlow, so it's still not complete and is in rapid development. Please don't hesitate to tell us if you run into problems when using it. Thank you for reporting this bug. :)

@AsakusaRinne
Collaborator

> Yep, it's on GPU. When I'm using the CPU, it's all OK. That's why, for now, I train the model, save the weights, then switch to the CPU, create the model, load the weights, and then I can save the full model ))

Thank you for letting us know. I think #1023 should resolve your problem. 😊

@Bender209

Thank you, I am now able to save models.

@Wanglongzhi2001
Collaborator

Wanglongzhi2001 commented Nov 13, 2023

Closing, since TensorFlow.NET can already save a trained model. Please reopen if you still have questions. Thanks!

10 participants