RFC: Keras Pre-made Models #95
Conversation
20190425-keras-premade-models.md
### Proposal 1: Customized training function & composability
We propose to let each subclassed pre-made model override the training function. Optionally, a special subclass `CannedModel` can be provided if other methods such as `compile` and `fit` need to be overridden as well. In traditional Keras models, the training function is dominated by autodiff: given the forward pass of the model, the gradients for each operation are generated and the backprop computation is automatically laid out for the entire model. However, this assumption is only valid for neural-network-based supervised learning architectures. For many other scenarios, we would need to break this assumption:
1. gradients may not be used, e.g., in unsupervised learning tasks
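To make the gradient-free case concrete, here is a minimal sketch of the pattern the proposal describes, using plain Python/NumPy stand-ins rather than real Keras classes (the `Model` base class and `train_step` hook below are illustrative names, not the actual 2019 Keras API): a pre-made KMeans-style model overrides its training step with a centroid update that involves no autodiff at all.

```python
# Sketch (assumption): a toy stand-in for a Keras-style Model, showing how a
# pre-made model could override its per-batch training step with a
# gradient-free update (a KMeans-style centroid move) instead of backprop.
import numpy as np

class Model:
    """Toy base class: fit() just delegates to train_step for each batch."""
    def fit(self, batches, epochs=1):
        for _ in range(epochs):
            for batch in batches:
                self.train_step(batch)

class KMeansModel(Model):
    """Pre-made model whose train_step uses no gradients at all."""
    def __init__(self, n_clusters, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.centroids = rng.normal(size=(n_clusters, dim))

    def train_step(self, batch):
        # Assign each point to its nearest centroid, then move each centroid
        # to the mean of its assigned points -- no autodiff involved.
        dists = np.linalg.norm(batch[:, None, :] - self.centroids[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        for k in range(len(self.centroids)):
            pts = batch[assign == k]
            if len(pts):
                self.centroids[k] = pts.mean(axis=0)

# Two well-separated Gaussian blobs around (-3, -3) and (3, 3).
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-3, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
model = KMeansModel(n_clusters=2, dim=2)
model.fit([data], epochs=5)
```

The point of the sketch is only the shape of the API: `fit` stays generic while the subclass supplies an entirely custom training step, which is what overriding the training function buys for models like KMeans or BoostedTrees.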
The subclass `CannedModel` seems like a half measure. If the training function is heavily modified, it seems strange not to have an object type that describes that. Perhaps `IrregularModel`, which would specify models that do not follow the typical Keras paradigm. The canned DNN model is a `CannedModel` imo, but not an irregular model.
The type `CannedModel` also makes it sound as if users are not supposed to be building their own unique model types. To me, canned means pre-made (which is what this RFC is about... but it should be extensible for users, imo).
It's probably better not to have `CannedModel` for now, and let's just override `train_function` (or `train_on_batch`) -- WDYT?
Agree that's better than an optional subclass. I would probably still prefer a new Model type so that it's understood how different these are from typical Keras Models, but something like `IrregularModel` may just add more bloat to the API than it's worth.
I'm keeping this as a design question to be addressed through design reviews.
Because the RFC is about …
Could you give a clear example of what you mean by pre-made models? I am not sure I clearly understand the difference between pre-made and extensible. Does this proposal mean we can have …
20190425-keras-premade-models.md
* relies on continuous graph rebuilding and checkpoint reloading, which slows down training
* relies on global collections and is not TF 2.0 friendly
* makes many advanced features such as meta/transfer learning difficult
* forces users to create input functions when not necessary
Why not make premade estimators better from these perspectives?
We have two sets of high-level APIs, and Keras is the one we chose to proceed with because of its simplicity and pythonic implementation.
20190425-keras-premade-models.md
### Challenges
Many canned models, including BoostedTrees and KMeans (as well as WideDeep and many more in the future), are highly complicated and do not follow the simple forward & backprop workflow, an assumption Keras heavily relies on. Building such models, while not compromising the basic composability of Keras (i.e., composing layers on top of layers), is the main challenge we need to resolve in this document. Another challenge is that these models can have multiple training phases (such as collaborative filtering models). We propose the following two approaches to address them:
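The multiple-training-phases challenge can be made concrete with a small NumPy sketch, assuming alternating least squares (ALS) as the collaborative-filtering example; all names here are illustrative, not part of the RFC. Each "epoch" consists of two distinct phases rather than one backprop pass, which is exactly the shape a single autodiff-driven training function cannot express.

```python
# Sketch (assumption): alternating least squares for matrix factorization,
# illustrating a training loop with two phases per epoch instead of a single
# forward + backprop pass.
import numpy as np

def als_epoch(ratings, users, items, reg=0.1):
    """One ALS epoch. Phase 1 solves for user factors with item factors
    fixed (a ridge-regression closed form); phase 2 does the reverse."""
    k = users.shape[1]
    eye = reg * np.eye(k)
    # Phase 1: update user factors, items held fixed.
    users = ratings @ items @ np.linalg.inv(items.T @ items + eye)
    # Phase 2: update item factors, users held fixed.
    items = ratings.T @ users @ np.linalg.inv(users.T @ users + eye)
    return users, items

# Build an exactly rank-2 ratings matrix, then recover factors via ALS.
rng = np.random.default_rng(0)
true_u, true_i = rng.normal(size=(4, 2)), rng.normal(size=(5, 2))
ratings = true_u @ true_i.T
users, items = rng.normal(size=(4, 2)), rng.normal(size=(5, 2))
for _ in range(50):
    users, items = als_epoch(ratings, users, items, reg=1e-6)
```

Neither phase uses gradients, and the two phases must alternate in a fixed order, so a pre-made model wrapping this would need full control over its training function rather than relying on Keras's default autodiff loop.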
### Proposal 1: Customized training function & composability |
What about other phases like evaluation and prediction? What's the difference between this and `tf.estimator`'s `model_fn`?
Evaluation and prediction are specified via the mode in the Estimator world. In Keras, the model will by default create `train_function`, `evaluate_function`, and `predict_function`, and use the corresponding one during fit/evaluate/predict.
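A toy sketch of the contrast described here, using illustrative names only (this is not the real Keras or Estimator code): Keras pairs each high-level call with its own execution function, whereas Estimator switches a single `model_fn` on a mode argument.

```python
# Sketch (assumption): a stand-in model where fit/evaluate/predict each build
# and call their own execution function, mirroring the Keras-side design
# described above. All class and method names are hypothetical.
class SketchModel:
    def make_train_function(self):
        # Real Keras would run forward pass + parameter updates here.
        return lambda batch: ("train", sum(batch))

    def make_evaluate_function(self):
        # Forward pass + metrics only, no updates.
        return lambda batch: ("evaluate", sum(batch))

    def make_predict_function(self):
        # Forward pass only; here a dummy transform stands in for inference.
        return lambda batch: ("predict", [x * 2 for x in batch])

    def fit(self, batch):
        return self.make_train_function()(batch)

    def evaluate(self, batch):
        return self.make_evaluate_function()(batch)

    def predict(self, batch):
        return self.make_predict_function()(batch)
```

Under the Estimator design, the three branches would instead live inside one `model_fn(features, labels, mode)` that inspects `mode` to decide which graph to build.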
@unrahul I'm not sure what you mean by …
@AakashKumarNain Not sure I understand the correlation -- from the discussion it looks like that is related to training/testing behavior for variables. The models in this proposal do not have any variables with that property.
@tanzhenyu Yeah, I got it, but then I think the name for the RFC should have been something else. Anyway, thank you for the clarification again.
Can we add some discussion on integration with ProcessingLayers, which should also help clarify what the inputs to a `CannedModel` are? @fchollet and I have brainstormed the following two options:

Option 1: Models accept a list of tensors as inputs, or a pair of lists for `DNNLinearModel`, where the pair is the list of DNN and linear inputs respectively.

Option 2: Instead of accepting a list of tensors (or a pair of lists for `DNNLinearModel`), models accept a single tensor (or a pair of tensors for `DNNLinearModel`). The input represents the concatenated features.
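The two input conventions can be sketched with NumPy arrays as stand-ins for tensors; the feature names below are made up for illustration.

```python
# Sketch (assumption): the two input options discussed above. Feature names
# and shapes are hypothetical; arrays stand in for tensors.
import numpy as np

features = {"age": np.array([[25.0], [40.0]]),
            "income": np.array([[3.0], [7.5]])}

# Option 1: the model accepts a list of per-feature tensors.
inputs_option1 = [features["age"], features["income"]]    # list of (batch, 1)

# Option 2: the model accepts one tensor of concatenated features.
inputs_option2 = np.concatenate(inputs_option1, axis=-1)  # shape (batch, 2)

# For DNNLinearModel, each option would be paired: (dnn_inputs, linear_inputs)
# under Option 1, or (dnn_tensor, linear_tensor) under Option 2.
```

Option 1 preserves per-feature identity inside the model (useful if the model itself routes features), while Option 2 pushes all feature handling into preprocessing layers and hands the model a single dense block.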
Feedback period will be open until 2019-05-10
Keras pre-made models
Objective
This document proposes several pre-made Keras models that would allow users to: