Codes/README.md
13 additions & 1 deletion
@@ -22,11 +22,23 @@
 # To Run the unimodal Vision Based models
 
 6.Vision+lstm_foldWise.py
-7.3DCNN_withFolds.py
+7.3DCNN_withFolds.py
 
 # To Run the Multimodal Model
 
 9. MultiModalFusionModelfoldWise.py
 
 # To extract all the video frames.
 frameExtract.py
+
+# Extraction of transcript
+
+The 'all__video_vosk_audioMap.p' has to be generated using the Vosk speech recognition toolkit(https://alphacephei.com/vosk/). The format of the file is in JSON format like the below:
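
The added README line names Vosk, but this hunk shows neither how the transcript file is produced nor its JSON layout. As a rough illustration only, assuming the `vosk` Python package and a downloaded model directory, a single mono 16-bit PCM WAV file could be transcribed along these lines. The function names `transcribe_wav` and `join_result_texts` are hypothetical and do not come from the repository, and the structure of `all__video_vosk_audioMap.p` is not reproduced here.

```python
import json
import wave


def join_result_texts(result_jsons):
    """Concatenate the "text" fields of Vosk result JSON strings.

    Hypothetical helper: Vosk returns each recognized segment as a JSON
    string containing a "text" key; empty segments are skipped.
    """
    texts = (json.loads(r).get("text", "") for r in result_jsons)
    return " ".join(t for t in texts if t)


def transcribe_wav(wav_path, model_dir):
    """Transcribe one mono 16-bit PCM WAV file (hypothetical helper).

    Assumes `pip install vosk` and that `model_dir` points to an
    unpacked Vosk model; neither is specified by the README.
    """
    from vosk import Model, KaldiRecognizer  # imported lazily on purpose

    results = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform is truthy when a full utterance was decoded.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        # Flush whatever audio is still buffered in the recognizer.
        results.append(rec.FinalResult())
    return join_result_texts(results)
```

How the per-video transcripts are then collected into the `.p` file is left to the repository's own format.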