Init model in various modes
train
: model will be loaded from huggingface
resume
: model will be loaded from params.ckpt_dir; if params.ckpt_dir does not contain a valid checkpoint, it will be loaded from huggingface
transfer
: model will be loaded from params.init_checkpoint; the corresponding path should contain checkpoints saved using m3tl
predict
: model will be loaded from params.ckpt_dir, excluding the optimizers' states
eval
: model will be loaded from params.ckpt_dir, excluding the optimizers' states; the model will also be compiled
Args:
- mirrored_strategy (tf.distribute.MirroredStrategy): mirrored strategy
- params (Params): params
- mode (str, optional): Mode, see the explanation above. Defaults to 'train'.
- inputs_to_build_model (Dict, optional): A batch of data. Defaults to None.
- model (Model, optional): Keras model. Defaults to None.
Returns:
- model: loaded model
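For instance, loading a trained model for offline prediction could look like the sketch below. This is a minimal sketch, assuming this section documents create_keras_model (the helper referenced in train_bert_multitask below), that params.ckpt_dir already contains a trained checkpoint, and that passing mirrored_strategy=None is acceptable when no distribution strategy is needed.
# minimal sketch: load a trained model without optimizer states (assumptions above)
model = create_keras_model(
    mirrored_strategy=None, params=params, mode='predict')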
Train Multi-task Bert model
Keyword Arguments:
- problem (str, optional) -- Problems to train. Defaults to 'weibo_ner'
- num_gpus (int, optional) -- Number of GPUs to use. Defaults to 1
- num_epochs (int, optional) -- Number of epochs to train. Defaults to 10
- model_dir (str, optional) -- model dir. Defaults to ''
- params (Params, optional) -- Params to define training and models. Defaults to None
- problem_type_dict (dict, optional) -- Key: problem name, value: problem type. Defaults to None
- processing_fn_dict (dict, optional) -- Key: problem name, value: problem data preprocessing fn. Defaults to None
- model (tf.keras.Model, optional): if not provided, it will be created using create_keras_model. Defaults to None.
- create_tf_record_only (bool, optional): if True, the function will only create TFRecords without training the model. Defaults to False.
- steps_per_epoch (int, optional): steps per epoch; if not provided, the train dataset will be looped over once to calculate steps per epoch. Defaults to None.
- warmup_ratio (float, optional): lr warmup ratio. Defaults to 0.1.
- continue_training (bool, optional): whether to resume training from model_dir. Defaults to False.
- mirrored_strategy (MirroredStrategy, optional): TensorFlow MirroredStrategy. Defaults to None.
- run_eagerly (bool, optional): Whether to run model eagerly. Defaults to False.
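The call below exercises train_bert_multitask as a quick smoke test: one epoch with a single step per epoch, run eagerly, resuming from the existing model dir, with horovod disabled and mirrored_strategy=False.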
params.use_horovod = False
model = train_bert_multitask(
problem=problem,
num_epochs=1,
params=params,
problem_type_dict=problem_type_dict,
processing_fn_dict=processing_fn_dict,
steps_per_epoch=1,
continue_training=True,
mirrored_strategy=False,
run_eagerly=True
)
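create_tensorspec_from_shape_type can be checked with a hand-built input: the tuple below pairs a dict of input shapes with a dict of dtypes for the text, image and class modalities, and the print shows the tf.TensorSpec structure derived from it.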
test_tup = ({
'text_input_ids': [None, 3],
'text_mask': [None, 3],
'text_segment_ids': [None, 3],
'image_input_ids': [None, 5, 10],
'image_mask': [None, 5],
'image_segment_ids': [None, 5],
'class_input_ids': [None, 1],
'class_mask': [None, 1],
'class_segment_ids': [None, 1]},
{
'text_input_ids': tf.int32,
'text_mask': tf.int32,
'text_segment_ids': tf.int32,
'image_input_ids': tf.float32,
'image_mask': tf.int32,
'image_segment_ids': tf.int32,
'class_input_ids': tf.int32,
'class_mask': tf.int32,
'class_segment_ids': tf.int32})
print(create_tensorspec_from_shape_type(test_tup))
Minimize checkpoint size for prediction.
Since the original checkpoint contains the optimizer's variables (for instance, if Adam is used, the checkpoint will be about three times the size of the model weights), this function removes the variables that are unused in prediction to save space.
Note: if the model is a multimodal model, you have to provide a fake_input_list that mimics the structure of the real input. Otherwise, modal embeddings will be randomly initialized.
Args:
- problem (str): problem
- input_dir (str): input dir
- output_dir (str): output dir
- problem_type_dict (Dict[str, str], optional): problem type dict. Defaults to None.
- fake_input_list (List, optional): fake input list to create dummy dataset
- params (Params, optional): params
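The snippet below builds fake multimodal inputs with generate_fake_data, so that modal embeddings are built before trimming, and then trims the checkpoint twice: once exporting a full SavedModel (save_weights_only=False) and once, presumably with the weights-only default, keeping only the checkpoint weights.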
tf.get_logger().setLevel('ERROR')
import numpy as np
from m3tl.predefined_problems.test_data import generate_fake_data
fake_inputs = [v for v, _ in generate_fake_data(output_format='gen_dict_tuple')]
# save as SavedModel pb
trim_checkpoint_for_prediction(
    problem=model.params.problem_str,
    input_dir=model.params.ckpt_dir,
    output_dir=model.params.ckpt_dir + '_pred',
    problem_type_dict=problem_type_dict,
    overwrite=True,
    fake_input_list=fake_inputs,
    save_weights_only=False)
trim_checkpoint_for_prediction(
problem=problem, input_dir=model.params.ckpt_dir,
output_dir=model.params.ckpt_dir+'_pred',
problem_type_dict=problem_type_dict, overwrite=True, fake_input_list=fake_inputs)
Evaluate Multi-task Bert model
Keyword Arguments:
- problem (str, optional): problems to evaluate. Defaults to 'weibo_ner'.
- num_gpus (int, optional): number of GPUs to use. Defaults to 1.
- model_dir (str, optional): model dir. Defaults to ''.
- params (Params, optional): params. Defaults to None.
- problem_type_dict (dict, optional): Key: problem name, value: problem type. Defaults to None.
- processing_fn_dict (dict, optional): Key: problem name, value: problem data preprocessing fn. Defaults to None.
- model (tf.keras.Model, optional): If not provided, it will be created with create_keras_model. Defaults to None.
- run_eagerly (bool, optional): Whether to run model eagerly. Defaults to False.
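Below, the original training checkpoint is removed and the model is evaluated from the trimmed prediction checkpoint; the second call passes the in-memory model directly instead of a model directory.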
import shutil
shutil.rmtree(model.params.ckpt_dir)
eval_bert_multitask(problem=problem, params=params,
problem_type_dict=problem_type_dict, processing_fn_dict=processing_fn_dict,
model_dir=model.params.ckpt_dir+'_pred')
# provide model instead of dir
eval_bert_multitask(problem=problem, params=params,
problem_type_dict=problem_type_dict, processing_fn_dict=processing_fn_dict,
model=model)
def arr_to_str(inp_arr: np.ndarray) -> List[str]:
    # serialize each row of the array as a JSON string
    return [json.dumps(row) for row in inp_arr.tolist()]
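# e.g. arr_to_str(np.array([[1, 2], [3, 4]])) -> ['[1, 2]', '[3, 4]']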
def decode_predictions(pred: Dict[str, np.ndarray], params: Params, array_as_str=False) -> Dict[str, Union[int, float, np.ndarray, list, str]]:
parsed_pred = dict()
problem_list = params.problem_list
label_encoder_dict = {p: get_or_make_label_encoder(
params=params, problem=p, mode=PREDICT) for p in problem_list}
for problem, problem_pred_array in pred.items():
        # additional outputs
        if problem not in problem_list:
            if isinstance(problem_pred_array, np.ndarray) and array_as_str:
                parsed_pred[problem] = arr_to_str(problem_pred_array)
            else:
                parsed_pred[problem] = problem_pred_array
            continue
label_encoder = label_encoder_dict[problem]
support_problem_type = [
'multi_cls',
'cls',
'seq_tag',
'regression',
'masklm',
'premask_mlm',
'vectorfit'
]
problem_type = params.get_problem_type(problem=problem)
if problem_type not in support_problem_type:
        logger.warning("trying to decode prediction of unsupported problem type"
                       " {}; if any error is raised, please disable prediction decoding".format(problem_type))
        is_multi_cls = problem_type == 'multi_cls'
        is_cls = problem_type == 'cls'
        is_seq_tag = problem_type == 'seq_tag'
        is_regression = problem_type == 'regression'
if is_regression:
parsed_pred[problem] = problem_pred_array
continue
# get pred from prob
if is_multi_cls:
problem_pred = problem_pred_array >= 0.5
elif is_cls or is_seq_tag:
problem_pred = np.argmax(problem_pred_array, axis=-1)
else:
problem_pred = problem_pred_array
# sequence labels
if is_seq_tag:
parsed_problem_pred = np.apply_along_axis(
label_encoder.inverse_transform, axis=1, arr=problem_pred)
else:
if isinstance(label_encoder, MultiLabelBinarizer) or isinstance(label_encoder, LabelEncoder):
parsed_problem_pred = label_encoder.inverse_transform(
problem_pred)
elif isinstance(label_encoder, PreTrainedTokenizer):
parsed_problem_pred = np.apply_along_axis(
label_encoder.convert_ids_to_tokens, axis=1, arr=problem_pred
)
else:
parsed_problem_pred = problem_pred_array
parsed_pred[problem] = parsed_problem_pred
return parsed_pred
pred, model = predict_bert_multitask(
problem='weibo_fake_ner',
inputs=fake_inputs*20, model_dir=model.params.ckpt_dir,
problem_type_dict=problem_type_dict,
processing_fn_dict=processing_fn_dict, return_model=True,
params=params)
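The raw probabilities in pred can then be turned into labels with decode_predictions defined above; a minimal usage sketch, reusing the pred and model returned by the call above:
# decode raw prediction arrays into labels (array_as_str left at its default)
decoded_pred = decode_predictions(pred=pred, params=model.params)
print(decoded_pred.keys())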