truncate_seq_pair[source]

truncate_seq_pair(tokens_a, tokens_b, target, max_length, rng=None, is_seq=False)

punc_augument[source]

punc_augument(raw_inputs, params)

This code is dedicated to the memory of a special time.

create_instances_from_document[source]

create_instances_from_document(all_documents, document_index, max_seq_length, short_seq_prob, masked_lm_prob, max_predictions_per_seq, vocab_words, rng)

Creates TrainingInstances for a single document.
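The signature closely resembles the instance-building step of BERT-style pretraining data pipelines, so the following is a minimal sketch under that assumption, not this library's implementation. It only shows how a document (a list of tokenized segments) might be split into an A/B sentence pair, with B sometimes drawn from another document. The function name `sketch_instances_from_document`, the dict-based return value, and the omission of the masking arguments are all illustrative choices, and it assumes `max_seq_length >= 5` and non-empty segments.

```python
import random


def sketch_instances_from_document(all_documents, document_index,
                                   max_seq_length, short_seq_prob, rng):
    """Hypothetical, simplified sentence-pair construction; masking is left out."""
    document = all_documents[document_index]
    max_num_tokens = max_seq_length - 3  # reserve room for [CLS], [SEP], [SEP]

    # Occasionally aim for a shorter sequence than the maximum.
    target_seq_length = max_num_tokens
    if rng.random() < short_seq_prob:
        target_seq_length = rng.randint(2, max_num_tokens)

    # Candidate documents for a random "next sentence" (assumed non-empty).
    others = [d for j, d in enumerate(all_documents) if j != document_index and d]

    instances = []
    chunk, chunk_len = [], 0
    for i, segment in enumerate(document):
        chunk.append(segment)
        chunk_len += len(segment)
        if i < len(document) - 1 and chunk_len < target_seq_length:
            continue  # keep accumulating segments

        # Split the chunk into an A part and a B part.
        a_end = rng.randint(1, len(chunk) - 1) if len(chunk) >= 2 else 1
        tokens_a = [tok for seg in chunk[:a_end] for tok in seg]

        # Replace B with text from another document about half the time,
        # or always when the chunk has no segments left over for B.
        is_random_next = bool(others) and (len(chunk) == 1 or rng.random() < 0.5)
        if is_random_next:
            random_doc = rng.choice(others)
            start = rng.randint(0, len(random_doc) - 1)
            tokens_b = [tok for seg in random_doc[start:] for tok in seg]
        else:
            tokens_b = [tok for seg in chunk[a_end:] for tok in seg]

        if tokens_a and tokens_b:
            # Trim the longer side from the end until the pair fits.
            while len(tokens_a) + len(tokens_b) > max_num_tokens:
                longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
                longer.pop()
            tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
            segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
            instances.append({"tokens": tokens,
                              "segment_ids": segment_ids,
                              "is_random_next": is_random_next})
        chunk, chunk_len = [], 0
    return instances
```

Calling it with `random.Random(1)` and a small list of tokenized documents returns a list of dicts whose `tokens` never exceed `max_seq_length`. The documented function presumably also applies masking to each pair, which this sketch leaves out.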

create_masked_lm_predictions[source]

create_masked_lm_predictions(tokens, masked_lm_prob, max_predictions_per_seq, vocab_words, rng)

Creates the predictions for the masked LM objective.
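As an illustration of what a masked LM objective typically involves, here is a minimal sketch assuming the common BERT-style 80/10/10 replacement rule: a selected token becomes `[MASK]` 80% of the time, stays unchanged 10% of the time, and is swapped for a random vocabulary word 10% of the time. Whether this library uses exactly these proportions is an assumption; the name `sketch_masked_lm_predictions` and the `MaskedLmInstance` tuple are hypothetical.

```python
import collections
import random

# Hypothetical record of one masked position and its original token.
MaskedLmInstance = collections.namedtuple("MaskedLmInstance", ["index", "label"])


def sketch_masked_lm_predictions(tokens, masked_lm_prob,
                                 max_predictions_per_seq, vocab_words, rng):
    """Hypothetical sketch of BERT-style token masking."""
    # Special tokens are never candidates for masking.
    cand_indexes = [i for i, tok in enumerate(tokens)
                    if tok not in ("[CLS]", "[SEP]")]
    rng.shuffle(cand_indexes)

    # Mask roughly masked_lm_prob of the tokens, at least one, capped at the max.
    num_to_predict = min(max_predictions_per_seq,
                         max(1, int(round(len(tokens) * masked_lm_prob))))

    output_tokens = list(tokens)
    masked_lms = []
    for index in cand_indexes[:num_to_predict]:
        if rng.random() < 0.8:
            masked_token = "[MASK]"            # 80%: replace with [MASK]
        elif rng.random() < 0.5:
            masked_token = tokens[index]       # 10%: keep the original token
        else:
            # 10%: replace with a random vocabulary word
            masked_token = vocab_words[rng.randint(0, len(vocab_words) - 1)]
        output_tokens[index] = masked_token
        masked_lms.append(MaskedLmInstance(index=index, label=tokens[index]))

    # Report positions in left-to-right order.
    masked_lms.sort(key=lambda m: m.index)
    positions = [m.index for m in masked_lms]
    labels = [m.label for m in masked_lms]
    return output_tokens, positions, labels
```

The sketch returns the modified token list together with the masked positions and their original tokens, which is the information a training loop needs to compute the masked LM loss. Passing a seeded `random.Random` instance as `rng` makes the masking reproducible.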