
Commit 8fe1748 (parent: 04d9735)

Authored by jsl-models, ahmedlone127, and maziyarpanahi

2024-07-05-phi2_7b_en (#14339)

* Add model 2024-07-05-phi2_7b_en
* Add model 2024-07-12-bart_large_cnn_en
* Update 2024-07-05-phi2_7b_en.md

Co-authored-by: ahmedlone127 <[email protected]>
Co-authored-by: Maziyar Panahi <[email protected]>

File tree: 2 files changed, +151 −0 lines
File 1 (2024-07-05-phi2_7b_en): 80 additions, 0 deletions
---
layout: model
title: Phi2 text-to-text model 7b int8
author: John Snow Labs
name: phi2
date: 2024-07-05
tags: [phi2, en, llm, open_source, openvino]
task: Text Generation
language: en
edition: Spark NLP 5.4.0
spark_version: 3.0
supported: true
engine: openvino
annotator: Phi2Transformer
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---
## Description

Pretrained phi2 model, adapted and imported into Spark NLP.
{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phi2_en_5.4.0_3.0_1720187078320.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phi2_en_5.4.0_3.0_1720187078320.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
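As a side note, both download links on this page follow one naming pattern: the zip name appears to encode the model name, language, Spark NLP edition, Spark version, and an upload timestamp in milliseconds. This is an observation from the URLs above, not a documented contract; a minimal Python sketch that unpacks it:

```python
from datetime import datetime, timezone

def parse_artifact_name(filename: str) -> dict:
    """Split a Spark NLP model zip name into its apparent fields.

    Assumed layout: <name>_<lang>_<edition>_<spark>_<millis>.zip
    (inferred from the download links on this page, not an official API).
    """
    stem = filename.removesuffix(".zip")
    # rsplit keeps multi-part names like "bart_large_cnn" intact
    name, lang, edition, spark, millis = stem.rsplit("_", 4)
    return {
        "name": name,
        "language": lang,
        "edition": edition,
        "spark_version": spark,
        "uploaded": datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc),
    }

info = parse_artifact_name("phi2_en_5.4.0_3.0_1720187078320.zip")
print(info["name"], info["language"], info["edition"])  # phi2 en 5.4.0
```

The embedded timestamp decodes to 2024-07-05 UTC, matching this card's date field.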
## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Phi2Transformer
from pyspark.ml import Pipeline

documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

phi2 = Phi2Transformer \
    .pretrained() \
    .setMaxOutputLength(50) \
    .setDoSample(False) \
    .setInputCols(["document"]) \
    .setOutputCol("phi2_generation")

pipeline = Pipeline().setStages([documentAssembler, phi2])
data = spark.createDataFrame([["Who is the founder of Spark-NLP?"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
```scala
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val phi2 = Phi2Transformer
  .pretrained()
  .setMaxOutputLength(50)
  .setDoSample(false)
  .setInputCols(Array("document"))
  .setOutputCol("phi2_generation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, phi2))
val data = Seq("Who is the founder of Spark-NLP?").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```
</div>
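In the snippets above, `setDoSample(False)` requests deterministic, greedy decoding: at each step the most probable next token is taken instead of drawing from the distribution. A toy illustration of that distinction in plain Python (no Spark NLP involved; the vocabulary and probabilities below are made up):

```python
import random

def next_token(probs: dict, do_sample: bool, rng: random.Random) -> str:
    """Pick the next token from a {token: probability} map.

    do_sample=False: greedy argmax, deterministic (like setDoSample(false)).
    do_sample=True:  draw from the distribution, varies run to run.
    """
    if not do_sample:
        return max(probs, key=probs.get)
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

probs = {"Paris": 0.6, "London": 0.3, "Rome": 0.1}
print(next_token(probs, do_sample=False, rng=random.Random(0)))  # always "Paris"
```

With sampling enabled, "London" or "Rome" would occasionally be chosen, which is why greedy decoding is the usual choice for reproducible generation.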
{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|phi2|
|Compatibility:|Spark NLP 5.4.0+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|9.1 GB|
File 2 (2024-07-12-bart_large_cnn_en): 71 additions, 0 deletions
---
layout: model
title: BART (large-sized model), fine-tuned on CNN Daily Mail
author: John Snow Labs
name: bart_large_cnn
date: 2024-07-12
tags: [bart, bartsummarization, cnn, text_to_text, en, open_source, tensorflow]
task: Summarization
language: en
edition: Spark NLP 5.4.0
spark_version: 3.0
supported: true
engine: tensorflow
annotator: BartTransformer
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---
## Description

BART model pre-trained on English, and fine-tuned on CNN Daily Mail. It was introduced in the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Lewis et al. and first released in [this repository](https://github.com/pytorch/fairseq/tree/master/examples/bart).

Disclaimer: The team releasing BART did not write a model card for this model, so this model card has been written by the Hugging Face team.

Model description

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.

BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering). This particular checkpoint has been fine-tuned on CNN Daily Mail, a large collection of text-summary pairs.
{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bart_large_cnn_en_5.4.0_3.0_1720754758442.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bart_large_cnn_en_5.4.0_3.0_1720754758442.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
from sparknlp.annotator import BartTransformer

bart = BartTransformer.pretrained("bart_large_cnn") \
    .setTask("summarize:") \
    .setMaxOutputLength(200) \
    .setInputCols(["documents"]) \
    .setOutputCol("summaries")
```
```scala
val bart = BartTransformer.pretrained("bart_large_cnn")
  .setTask("summarize:")
  .setMaxOutputLength(200)
  .setInputCols("documents")
  .setOutputCol("summaries")
```
</div>
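BART-large's encoder accepts a bounded input (commonly 1024 subword tokens), so very long CNN-style articles may need to be summarized in pieces. A hypothetical pre-processing helper, plain Python and not part of the Spark NLP API, that splits text into overlapping word windows before feeding each chunk through the pipeline above (the window and overlap sizes are illustrative guesses, not tuned values):

```python
def chunk_words(text: str, max_words: int = 700, overlap: int = 50) -> list:
    """Split text into word windows of at most max_words, overlapping by
    `overlap` words so sentences cut at a boundary appear in both chunks."""
    words = text.split()
    if len(words) <= max_words:
        return [" ".join(words)]
    chunks, start = [], 0
    step = max_words - overlap
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
        start += step
    return chunks

# A 1500-word input becomes three overlapping chunks.
print(len(chunk_words("word " * 1500)))  # 3
```

Each chunk can then be placed in its own row of the input DataFrame, and the per-chunk summaries concatenated or re-summarized afterwards.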
{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bart_large_cnn|
|Compatibility:|Spark NLP 5.4.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents]|
|Output Labels:|[generation]|
|Language:|en|
|Size:|974.9 MB|