RFC: Keras Pre-made Models #95
Conversation
20190425-keras-premade-models.md
### Proposal 1: Customized training function & composability
We propose to let each subclassed pre-made model override the training function. Optionally, a special subclass `CannedModel` can be provided if other methods such as `compile` and `fit` need to be overridden as well. In traditional Keras models, the training function is dominated by autodiff: given the forward pass of the model, the gradients for each operation are generated and the backprop computation is automatically laid out for the entire model. However, this assumption is only valid for neural-network-based supervised learning architectures. For many other scenarios, we would need to break this assumption:
1. gradients may not be used, e.g., in unsupervised learning tasks
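To make the gradient-free case concrete, here is a minimal sketch of the pattern the proposal describes, using plain Python/NumPy stand-ins rather than real Keras classes (the `Model` base class and `train_step` hook below are illustrative names, not the actual 2019 Keras API): a pre-made KMeans-style model overrides its training step with a centroid update that involves no autodiff at all.

```python
# Sketch (assumption): a toy stand-in for a Keras-style Model, showing how a
# pre-made model could override its per-batch training step with a
# gradient-free update (a KMeans-style centroid move) instead of backprop.
import numpy as np

class Model:
    """Toy base class: fit() just delegates to train_step for each batch."""
    def fit(self, batches, epochs=1):
        for _ in range(epochs):
            for batch in batches:
                self.train_step(batch)

class KMeansModel(Model):
    """Pre-made model whose train_step uses no gradients at all."""
    def __init__(self, n_clusters, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.centroids = rng.normal(size=(n_clusters, dim))

    def train_step(self, batch):
        # Assign each point to its nearest centroid, then move each centroid
        # to the mean of its assigned points -- no autodiff involved.
        dists = np.linalg.norm(batch[:, None, :] - self.centroids[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        for k in range(len(self.centroids)):
            pts = batch[assign == k]
            if len(pts):
                self.centroids[k] = pts.mean(axis=0)

# Two well-separated Gaussian blobs around (-3, -3) and (3, 3).
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-3, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
model = KMeansModel(n_clusters=2, dim=2)
model.fit([data], epochs=5)
```

The point of the sketch is only the shape of the API: `fit` stays generic while the subclass supplies an entirely custom training step, which is what overriding the training function buys for models like KMeans or BoostedTrees.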
The subclass `CannedModel` seems like a half measure. If the training function is heavily modified, it seems strange not to have an object type that describes that. Perhaps `IrregularModel`, which would specify models that do not follow the typical Keras paradigm. The canned DNN model is a `CannedModel` imo, but not an irregular model.
The type `CannedModel` also makes it sound as if users are not supposed to be building their own unique model types. To me, canned means pre-made (which is what this RFC is about... but it should be extensible for users, imo).
It's probably better not to have `CannedModel` for now, and let's just override `train_function` (or `train_on_batch`) -- WDYT?
Agree that's better than an optional subclass. I would probably still prefer a new Model type so that it's understood how different these are from typical Keras Models, but something like `IrregularModel` may just add more bloat to the API than it's worth.
I'm keeping this as a design question to be addressed through design reviews.
Because the RFC is about …
Could you give a clear example of what you mean by pre-made models? I am not sure I clearly understand the difference between pre-made and extensible. Does this proposal mean we can have …
20190425-keras-premade-models.md
* relies on continuous graph rebuilding and checkpoint reloading, which slows down training
* relies on global collections and is not TF 2.0 friendly
* makes many advanced features such as meta/transfer learning difficult
* forces users to create input functions when not necessary
Why not make premade estimators better from these perspectives?
We have two sets of high-level APIs, and Keras is the one we chose to proceed with because of its simplicity and pythonic implementation.
20190425-keras-premade-models.md
### Challenges
Many canned models, including BoostedTrees and KMeans (as well as WideDeep and many more in the future), are highly complicated and do not follow the simple forward & backprop workflow, an assumption Keras heavily relies on. Building such models, while not compromising the basic composability of Keras (i.e., composing layers on top of layers), is the main challenge we need to resolve in this document. Another challenge is that these models can have multiple training phases (such as collaborative filtering models). We propose the following two approaches to address them:
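The multiple-training-phases challenge can be made concrete with a small NumPy sketch, assuming alternating least squares (ALS) as the collaborative-filtering example; all names here are illustrative, not part of the RFC. Each "epoch" consists of two distinct phases rather than one backprop pass, which is exactly the shape a single autodiff-driven training function cannot express.

```python
# Sketch (assumption): alternating least squares for matrix factorization,
# illustrating a training loop with two phases per epoch instead of a single
# forward + backprop pass.
import numpy as np

def als_epoch(ratings, users, items, reg=0.1):
    """One ALS epoch. Phase 1 solves for user factors with item factors
    fixed (a ridge-regression closed form); phase 2 does the reverse."""
    k = users.shape[1]
    eye = reg * np.eye(k)
    # Phase 1: update user factors, items held fixed.
    users = ratings @ items @ np.linalg.inv(items.T @ items + eye)
    # Phase 2: update item factors, users held fixed.
    items = ratings.T @ users @ np.linalg.inv(users.T @ users + eye)
    return users, items

# Build an exactly rank-2 ratings matrix, then recover factors via ALS.
rng = np.random.default_rng(0)
true_u, true_i = rng.normal(size=(4, 2)), rng.normal(size=(5, 2))
ratings = true_u @ true_i.T
users, items = rng.normal(size=(4, 2)), rng.normal(size=(5, 2))
for _ in range(50):
    users, items = als_epoch(ratings, users, items, reg=1e-6)
```

Neither phase uses gradients, and the two phases must alternate in a fixed order, so a pre-made model wrapping this would need full control over its training function rather than relying on Keras's default autodiff loop.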
### Proposal 1: Customized training function & composability |
What about other phases like evaluation and prediction? What's the difference between this and `tf.estimator`'s `model_fn`?
Evaluation and prediction are specified via the mode in the Estimator world. In Keras, the model will by default create `train_function`, `evaluate_function`, and `predict_function`, and use the corresponding one during fit/evaluate/predict.
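A toy sketch of the contrast described here, using illustrative names only (this is not the real Keras or Estimator code): Keras pairs each high-level call with its own execution function, whereas Estimator switches a single `model_fn` on a mode argument.

```python
# Sketch (assumption): a stand-in model where fit/evaluate/predict each build
# and call their own execution function, mirroring the Keras-side design
# described above. All class and method names are hypothetical.
class SketchModel:
    def make_train_function(self):
        # Real Keras would run forward pass + parameter updates here.
        return lambda batch: ("train", sum(batch))

    def make_evaluate_function(self):
        # Forward pass + metrics only, no updates.
        return lambda batch: ("evaluate", sum(batch))

    def make_predict_function(self):
        # Forward pass only; here a dummy transform stands in for inference.
        return lambda batch: ("predict", [x * 2 for x in batch])

    def fit(self, batch):
        return self.make_train_function()(batch)

    def evaluate(self, batch):
        return self.make_evaluate_function()(batch)

    def predict(self, batch):
        return self.make_predict_function()(batch)
```

Under the Estimator design, the three branches would instead live inside one `model_fn(features, labels, mode)` that inspects `mode` to decide which graph to build.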
@unrahul I'm not sure what you mean by …
@AakashKumarNain Not sure I understand the correlation -- from the discussion it looks like that is related to training/testing behavior for variables. The models in this proposal do not have any variables with that property.
@tanzhenyu Yeah, I got it, but then I think the name for the RFC should have been something else. Anyway, thank you for the clarification again.
Can we add some discussion on integration with ProcessingLayers, which should also help clarify what the inputs to a `CannedModel` are? @fchollet and I have brainstormed the following two options:

Option 1: Models accept a list of tensors as inputs, or a pair of lists for `DNNLinearModel`, where the pair is the list of DNN and linear inputs respectively.

Option 2: Instead of accepting a list of tensors (or a pair of lists for `DNNLinearModel`), models accept a single tensor (or a pair of tensors for `DNNLinearModel`). The input represents the concatenated features.
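The two input conventions can be sketched with NumPy arrays as stand-ins for tensors; the feature names below are made up for illustration.

```python
# Sketch (assumption): the two input options discussed above. Feature names
# and shapes are hypothetical; arrays stand in for tensors.
import numpy as np

features = {"age": np.array([[25.0], [40.0]]),
            "income": np.array([[3.0], [7.5]])}

# Option 1: the model accepts a list of per-feature tensors.
inputs_option1 = [features["age"], features["income"]]    # list of (batch, 1)

# Option 2: the model accepts one tensor of concatenated features.
inputs_option2 = np.concatenate(inputs_option1, axis=-1)  # shape (batch, 2)

# For DNNLinearModel, each option would be paired: (dnn_inputs, linear_inputs)
# under Option 1, or (dnn_tensor, linear_tensor) under Option 2.
```

Option 1 preserves per-feature identity inside the model (useful if the model itself routes features), while Option 2 pushes all feature handling into preprocessing layers and hands the model a single dense block.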
Feedback period will be open until 2019-05-10
Keras pre-made models
Objective
This document proposes several pre-made Keras models that would allow users to: