Commit 12f61ed

refine, documentation, sample and structuring
- add the java docs.
- refine the classes style.
- update the sample project to call text2speech.
1 parent 9bff6ef commit 12f61ed

18 files changed

+385
-47
lines changed

README.md

Lines changed: 14 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# Intelligent Java
2-
[![Maven Central](https://img.shields.io/maven-central/v/io.github.barqawiz/intellijava.core?style=for-the-badge)](https://central.sonatype.com/artifact/io.github.barqawiz/intellijava.core/0.6.2)
2+
[![Maven Central](https://img.shields.io/maven-central/v/io.github.barqawiz/intellijava.core?style=for-the-badge)](https://central.sonatype.com/artifact/io.github.barqawiz/intellijava.core/0.7.0)
33
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/Barqawiz/IntelliJava?style=for-the-badge)](https://github.com/Barqawiz/IntelliJava/releases)
44
[![GitHub](https://img.shields.io/github/license/Barqawiz/IntelliJava?style=for-the-badge)](https://opensource.org/licenses/Apache-2.0)
55

@@ -9,12 +9,13 @@ Intelligent java (IntelliJava) is the ultimate tool for Java developers looking
99
The supported models:
1010
- **OpenAI**: Access GPT-3 to generate text and DALL·E to generate images. OpenAI is preferred when you want quality results without tuning.
1111
- **Cohere.ai**: Generate text; Cohere allows you to customize a language model to suit your specific needs.
12+
- **Google AI**: Generate audio from text; Access DeepMind’s speech models.
1213

1314
# How to use
1415

1516
1. Import the core jar file OR maven dependency (check the Integration section).
1617
2. Add Gson dependency if using the jar file; otherwise, it's handled by maven or Gradle.
17-
3. Call the ``RemoteLanguageModel`` for the language models and ``RemoteImageModel`` for image generation.
18+
3. Call the ``RemoteLanguageModel`` for the language models, ``RemoteImageModel`` for image generation, and ``RemoteSpeechModel`` for text-to-speech models.
1819

1920
## Integration
2021
The package released to Maven Central Repository:
@@ -24,23 +25,23 @@ Maven:
2425
<dependency>
2526
<groupId>io.github.barqawiz</groupId>
2627
<artifactId>intellijava.core</artifactId>
27-
<version>0.6.2</version>
28+
<version>0.7.0</version>
2829
</dependency>
2930
```
3031

3132
Gradle:
3233

3334
```
34-
implementation 'io.github.barqawiz:intellijava.core:0.6.2'
35+
implementation 'io.github.barqawiz:intellijava.core:0.7.0'
3536
```
3637

3738
Gradle(Kotlin):
3839
```
39-
implementation("io.github.barqawiz:intellijava.core:0.6.2")
40+
implementation("io.github.barqawiz:intellijava.core:0.7.0")
4041
```
4142

4243
Jar download:
43-
[intellijava.jar](https://repo1.maven.org/maven2/io/github/barqawiz/intellijava.core/0.6.2/intellijava.core-0.6.2.jar).
44+
[intellijava.jar](https://repo1.maven.org/maven2/io/github/barqawiz/intellijava.core/0.7.0/intellijava.core-0.7.0.jar).
4445

4546
For ready integration: try the [sample_code](https://github.com/Barqawiz/IntelliJava/tree/main/sample_code).
4647

@@ -80,21 +81,21 @@ The only dependency is **GSON**.
8081
For Maven:
8182
```
8283
<dependency>
83-
<groupId>com.google.code.gson</groupId>
84-
<artifactId>gson</artifactId>
85-
<version>2.8.9</version>
84+
<groupId>com.google.code.gson</groupId>
85+
<artifactId>gson</artifactId>
86+
<version>2.10.1</version>
8687
</dependency>
8788
```
8889

8990
For Gradle:
9091
```
9192
dependencies {
92-
implementation 'com.google.code.gson:gson:2.8.9'
93+
implementation 'com.google.code.gson:gson:2.10.1'
9394
}
9495
```
9596

9697
For jar download:
97-
[gson download repo](https://search.maven.org/artifact/com.google.code.gson/gson/2.8.9/jar)
98+
[gson download repo](https://search.maven.org/artifact/com.google.code.gson/gson/2.10.1/jar)
9899

99100
## Documentation
100101
[Go to Java docs](https://barqawiz.github.io/IntelliJava/javadocs/)
@@ -106,12 +107,11 @@ Call for contributors:
106107
- [ ] Add support to other OpenAI functions.
107108
- [x] Add support to cohere generate API.
108109
- [ ] Add support to Google language models.
110+
- [x] Add support to Google speech models.
109111
- [ ] Add support to Amazon language models.
110-
- [ ] Add support to Azure models.
112+
- [ ] Add support to Azure nlp models.
111113
- [ ] Add support to Midjourney image generation.
112114
- [ ] Add support to WuDao 2.0 model.
113-
- [ ] Add support to an audio model.
114-
115115

116116
# License
117117
Apache License

core/com.intellijava.core/pom.xml

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@
6666
<dependency>
6767
<groupId>com.google.code.gson</groupId>
6868
<artifactId>gson</artifactId>
69-
<version>2.8.9</version>
69+
<version>2.10.1</version>
7070
</dependency>
7171
</dependencies>
7272

core/com.intellijava.core/src/main/java/com/intellijava/core/controller/RemoteSpeechModel.java

Lines changed: 75 additions & 2 deletions
@@ -1,4 +1,20 @@
1+
/**
2+
* Copyright 2023 Github.com/Barqawiz/IntelliJava
3+
*
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
*
8+
* http://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
*/
116
package com.intellijava.core.controller;
17+
218
import java.io.IOException;
319
import java.util.ArrayList;
420
import java.util.HashMap;
@@ -11,11 +27,32 @@
1127
import com.intellijava.core.utils.AudioHelper;
1228
import com.intellijava.core.wrappers.GoogleAIWrapper;
1329

30+
/**
31+
* RemoteSpeechModel class provides a remote speech model implementation.
32+
* It generates speech from text using the Wrapper classes.
33+
*
34+
* This version supports Google speech models only.
35+
*
36+
* To use Google speech services:
37+
* 1- Go to console.cloud.google.com.
38+
* 2- Enable "Cloud Text-to-Speech API".
39+
* 3- Generate API key from "Credentials" page.
40+
*
41+
* @author github.com/Barqawiz
42+
*/
1443
public class RemoteSpeechModel {
1544

1645
private SpeechModels keyType;
1746
private GoogleAIWrapper wrapper;
1847

48+
/**
49+
*
50+
* Constructs a new RemoteSpeechModel object with the specified key value and key type string.
51+
* If keyTypeString is empty, it is set to "google" by default.
52+
*
53+
* @param keyValue the API key value to use.
54+
* @param keyTypeString the string representation of the key type.
55+
*/
1956
public RemoteSpeechModel(String keyValue, String keyTypeString) {
2057

2158
if (keyTypeString.isEmpty()) {
@@ -33,16 +70,34 @@ public RemoteSpeechModel(String keyValue, String keyTypeString) {
3370
}
3471
}
3572

73+
/**
74+
*
75+
* Constructs a new RemoteSpeechModel object with the specified key value and key type.
76+
*
77+
* @param keyValue The API key value to use.
78+
* @param keyType The SpeechModels enum value representing the key type.
79+
*/
3680
public RemoteSpeechModel(String keyValue, SpeechModels keyType) {
3781
this.initiate(keyValue, keyType);
3882
}
3983

84+
/**
85+
* Initiate the object with the specified key value and key type.
86+
*
87+
* @param keyValue the API key value to use.
88+
* @param keyType the SpeechModels enum value representing the key type.
89+
*/
4090
private void initiate(String keyValue, SpeechModels keyType) {
4191

4292
this.keyType = keyType;
4393
wrapper = new GoogleAIWrapper(keyValue);
4494
}
4595

96+
/**
97+
* Get a list of supported key type models.
98+
*
99+
* @return list of the supported SpeechModels enum values.
100+
*/
46101
public List<String> getSupportedModels() {
47102
SpeechModels[] values = SpeechModels.values();
48103
List<String> enumValues = new ArrayList<>();
@@ -54,6 +109,15 @@ public List<String> getSupportedModels() {
54109
return enumValues;
55110
}
56111

112+
/**
113+
* Generates speech from text using the supported models.
114+
*
115+
* You can save the returned bytes to an audio file using FileOutputStream("path/audio.mp3").
116+
*
117+
* @param input SpeechInput object containing the text and gender to use.
118+
* @return byte array of the decoded audio content.
119+
* @throws IOException in case of communication error.
120+
*/
57121
public byte[] generateEnglishText(SpeechInput input) throws IOException {
58122

59123
if (this.keyType == SpeechModels.google) {
@@ -63,14 +127,23 @@ public byte[] generateEnglishText(SpeechInput input) throws IOException {
63127
}
64128
}
65129

66-
private byte[] generateGoogleText(String text, Gender geneder, String language) throws IOException {
130+
/**
131+
* Generates speech from text using the Google Speech service API.
132+
*
133+
* @param text text to generate the speech.
134+
* @param gender gender to use (male or female).
135+
* @param language the language code, e.g. en-gb.
136+
* @return byte array of the decoded audio content.
137+
* @throws IOException in case of communication error.
138+
*/
139+
private byte[] generateGoogleText(String text, Gender gender, String language) throws IOException {
67140
byte[] decodedAudio = null;
68141

69142
Map<String, Object> params = new HashMap<>();
70143
params.put("text", text);
71144
params.put("languageCode", language);
72145

73-
if (geneder == Gender.FEMALE) {
146+
if (gender == Gender.FEMALE) {
74147
params.put("name", "en-GB-Standard-A");
75148
params.put("ssmlGender", "FEMALE");
76149
} else {
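
The Javadoc for `generateEnglishText` suggests saving the returned bytes with `FileOutputStream`. A minimal sketch of that step is shown here; `AudioFileWriter` and its `save` method are hypothetical helpers written for illustration and are not part of IntelliJava.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper, not part of IntelliJava.
final class AudioFileWriter {

    private AudioFileWriter() {}

    /** Writes the audio bytes to the target path, replacing any existing file. */
    static void save(byte[] audio, Path target) throws IOException {
        if (audio == null || audio.length == 0) {
            throw new IllegalArgumentException("audio content is empty");
        }
        // try-with-resources closes the stream even if the write fails.
        try (FileOutputStream out = new FileOutputStream(target.toFile())) {
            out.write(audio);
        }
    }
}
```

Assuming `model` is a `RemoteSpeechModel` and `input` a `SpeechInput`, a call might look like `AudioFileWriter.save(model.generateEnglishText(input), Paths.get("path/audio.mp3"))`.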

core/com.intellijava.core/src/main/java/com/intellijava/core/model/AudioResponse.java

Lines changed: 39 additions & 0 deletions
@@ -1,16 +1,55 @@
1+
/**
2+
* Copyright 2023 Github.com/Barqawiz/IntelliJava
3+
*
4+
* Licensed under the Apache License, Version 2.0 (the "License");
5+
* you may not use this file except in compliance with the License.
6+
* You may obtain a copy of the License at
7+
*
8+
* http://www.apache.org/licenses/LICENSE-2.0
9+
*
10+
* Unless required by applicable law or agreed to in writing, software
11+
* distributed under the License is distributed on an "AS IS" BASIS,
12+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
* See the License for the specific language governing permissions and
14+
* limitations under the License.
15+
*/
116
package com.intellijava.core.model;
217

318
import com.google.gson.annotations.SerializedName;
419

20+
/**
21+
*
22+
* AudioResponse represents the response from the speech API that contains the audio content.
23+
*
24+
* @author github.com/Barqawiz
25+
*
26+
*/
527
public class AudioResponse extends BaseRemoteModel {
628

29+
/**
30+
* Default AudioResponse constructor.
31+
*/
32+
public AudioResponse() {}
33+
34+
/**
35+
* The audio content generated from a text.
36+
*/
737
@SerializedName("audioContent")
838
private String audioContent;
939

40+
/**
41+
* Gets the audio content generated from a text.
42+
* @return audio content as a base64 string.
43+
*/
1044
public String getAudioContent() {
1145
return audioContent;
1246
}
1347

48+
/**
49+
* Sets the audio content generated from a text.
50+
*
51+
* @param audioContent audio content as a base64 string.
52+
*/
1453
public void setAudioContent(String audioContent) {
1554
this.audioContent = audioContent;
1655
}
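
The Javadoc describes `audioContent` as a base64 string, while `generateEnglishText` is documented to return decoded bytes. The sketch below shows one way that decoding could be done with the JDK's `java.util.Base64`. `AudioContentDecoder` is a hypothetical name, the standard Base64 alphabet is an assumption, and the library's own `AudioHelper` may work differently.

```java
import java.util.Base64;

// Hypothetical helper, not part of IntelliJava.
final class AudioContentDecoder {

    private AudioContentDecoder() {}

    /**
     * Decodes a base64 string (standard alphabet assumed) into raw bytes.
     * Returns an empty array for null or empty input.
     */
    static byte[] decode(String audioContent) {
        if (audioContent == null || audioContent.isEmpty()) {
            return new byte[0];
        }
        // Throws IllegalArgumentException if the input is not valid base64.
        return Base64.getDecoder().decode(audioContent);
    }
}
```

The decoded array is what a caller would then write to an audio file.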
core/com.intellijava.core/src/main/java/com/intellijava/core/model/SpeechModels.java

Lines changed: 7 additions & 1 deletion
@@ -1,5 +1,11 @@
11
package com.intellijava.core.model;
22

3+
/**
4+
* Supported speech models.
5+
*
6+
* @author github.com/Barqawiz
7+
*
8+
*/
39
public enum SpeechModels {
4-
google
10+
/** Google model. */ google
511
}

core/com.intellijava.core/src/main/java/com/intellijava/core/model/input/ImageModelInput.java

Lines changed: 3 additions & 3 deletions
@@ -110,7 +110,7 @@ public ImageModelInput build() {
110110
}
111111
}
112112
/**
113-
* Getter for prompt.
113+
* Getter for prompt, the text of the required action or the question.
114114
* @return prompt
115115
*/
116116
public String getPrompt() {
@@ -146,7 +146,7 @@ public void setPrompt(String prompt) {
146146

147147
/**
148148
* Setter for numberOfImages.
149-
* @param numberOfImages
149+
* @param numberOfImages the number of the generated images.
150150
*/
151151
public void setNumberOfImages(int numberOfImages) {
152152
this.numberOfImages = numberOfImages;
@@ -156,7 +156,7 @@ public void setNumberOfImages(int numberOfImages) {
156156
/**
157157
* Setter for imageSize.
158158
*
159-
* @param imageSize
159+
* @param imageSize the size of the generated images, options are: 256x256, 512x512, or 1024x1024.
160160
*/
161161
public void setImageSize(String imageSize) {
162162
this.imageSize = imageSize;
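
The updated Javadoc lists three accepted image sizes. A caller could check a value before passing it to `setImageSize`; the sketch below is one way to do that. `ImageSizes` is a hypothetical helper, and nothing in this diff confirms that the library performs such validation itself.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of IntelliJava.
final class ImageSizes {

    // Sizes taken from the setImageSize Javadoc in this commit.
    private static final Set<String> SUPPORTED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("256x256", "512x512", "1024x1024")));

    private ImageSizes() {}

    /** Returns true if the size is one of the documented options. */
    static boolean isSupported(String imageSize) {
        return imageSize != null && SUPPORTED.contains(imageSize);
    }

    /** Returns the size unchanged, or throws if it is not a documented option. */
    static String requireSupported(String imageSize) {
        if (!isSupported(imageSize)) {
            throw new IllegalArgumentException("Unsupported image size: " + imageSize);
        }
        return imageSize;
    }
}
```

With this in place, `input.setImageSize(ImageSizes.requireSupported("512x512"))` would fail fast on a typo instead of sending an invalid request.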

core/com.intellijava.core/src/main/java/com/intellijava/core/model/input/LanguageModelInput.java

Lines changed: 3 additions & 3 deletions
@@ -203,7 +203,7 @@ public void setPrompt(String prompt) {
203203
/**
204204
* Setter for temperature.
205205
*
206-
* @param temperature
206+
* @param temperature higher values mean more risk and creativity.
207207
*/
208208
public void setTemperature(float temperature) {
209209
this.temperature = temperature;
@@ -212,7 +212,7 @@ public void setTemperature(float temperature) {
212212
/**
213213
* Setter for maxTokens.
214214
*
215-
* @param maxTokens
215+
* @param maxTokens maximum size of the model input and output.
216216
*/
217217
public void setMaxTokens(int maxTokens) {
218218
this.maxTokens = maxTokens;
@@ -221,7 +221,7 @@ public void setMaxTokens(int maxTokens) {
221221
/**
222222
* Setter for numberOfOutputs.
223223
*
224-
* @param numberOfOutputs
224+
* @param numberOfOutputs number of model outputs, default value is 1.
225225
*/
226226
public void setNumberOfOutputs(int numberOfOutputs) {
227227
this.numberOfOutputs = numberOfOutputs;
