Commit 95328ba

typos
1 parent 4db34bb commit 95328ba

4 files changed

+39
-29
lines changed

articles/azure-video-indexer/audio-effects-detection-overview.md

Lines changed: 17 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -12,17 +12,17 @@ ms.topic: article
1212

1313
# Audio effects detection
1414

15-
Audio effects detection is an Azure Video Indexer feature that detects insights on a variety of acoustic events and classifies them into acoustic categories. Audio effect detection can detect and classify different categories such as laughter, crowd reactions, alarms and/or sirens.
15+
Audio effects detection is an Azure Video Indexer feature that detects insights on various acoustic events and classifies them into acoustic categories. Audio effect detection can detect and classify different categories such as laughter, crowd reactions, alarms and/or sirens.
1616

17-
When working on the website, the instances are displayed in the Insights tab. They can also be generated in a categorized list in a JSON file which includes the category ID, type, name, and instances per category together with the specific timeframes and confidence score.
17+
When working on the website, the instances are displayed in the Insights tab. They can also be generated in a categorized list in a JSON file that includes the category ID, type, name, and instances per category together with the specific timeframes and confidence score.
1818

1919
## Prerequisites
2020

2121
Review [transparency note overview](/legal/azure-video-indexer/transparency-note?context=/azure/azure-video-indexer/context/context)
2222

2323
## General principles
2424

25-
This article discusses audio effects detection and the key considerations for making use of this technology responsibly. There are a number of things you need to consider when deciding how to use and implement an AI-powered feature:
25+
This article discusses audio effects detection and the key considerations for making use of this technology responsibly. There are many things you need to consider when deciding how to use and implement an AI-powered feature:
2626

2727
* Will this feature perform well in my scenario? Before deploying audio effects detection into your scenario, test how it performs using real-life data and make sure it can deliver the accuracy you need.
2828
* Are we equipped to identify and respond to errors? AI-powered products and features won't be 100% accurate, so consider how you'll identify and respond to any errors that may occur.
@@ -36,7 +36,7 @@ To see the instances on the website, do the following:
3636

3737
To display the JSON file, do the following:
3838

39-
1. Click Download -> Insights (JSON).
39+
1. Select Download -> Insights (JSON).
4040
1. Copy the `audioEffects` element, under `insights`, and paste it into your Online JSON viewer.
4141

4242
```json
@@ -81,40 +81,40 @@ During the audio effects detection procedure, audio in a media file is processed
8181
|Source file | The user uploads the source file for indexing. |
8282
|Segmentation| The audio is analyzed, non-speech audio is identified and then split into short overlapping internals. |
8383
|Classification| An AI process analyzes each segment and classifies its contents into event categories such as crowd reaction or laughter. A probability list is then created for each event category according to department-specific rules. |
84-
|Confidence level| The estimated confidence level of each audio effect is calculated as a range of 0 to 1. The confidence score represents the certainty in the accuracy of the result. For example, an 82% certainty will be represented as an 0.82 score.|
84+
|Confidence level| The estimated confidence level of each audio effect is calculated as a range of 0 to 1. The confidence score represents the certainty in the accuracy of the result. For example, an 82% certainty is represented as an 0.82 score.|
8585

8686
## Example use cases
8787

88-
- Companies with a large video archive can improve accessibility by offering more context for a hearing- impaired audience by transcription of non-speech effects.
89-
- Improved efficiency when creating raw data for content creators. Important moments in promos and trailers such as laughter, crowd reactions, gunshots, or explosions can be identified for example in Media and Entertainment.
88+
- Companies with a large video archive can improve accessibility by offering more context for a hearing- impaired audience by transcription of nonspeech effects.
89+
- Improved efficiency when creating raw data for content creators. Important moments in promos and trailers such as laughter, crowd reactions, gunshots, or explosions can be identified, for example, in Media and Entertainment.
9090
- Detecting and classifying gunshots, explosions, and glass shattering in a smart-city system or in other public environments that include cameras and microphones to offer fast and accurate detection of violence incidents.
9191

9292
## Considerations and limitations when choosing a use case
9393

94-
- Avoid use of very short or low-quality audio, audio effects detection provides probabilistic and partial data on detected non-speech audio events. For accuracy, audio effects detection requires at least 2 seconds of clear non-speech audio. Voice commands or singing are not supported.  
95-
- Avoid use of audio with very loud background music or music with repetitive and/or linearly scanned frequency, audio effects detection is designed for non-speech audio only and therefore cannot classify events in loud music. Music with repetitive and/or linearly scanned frequency many be incorrectly classified as an alarm or siren.
94+
- Avoid use of very short or low-quality audio, audio effects detection provides probabilistic and partial data on detected nonspeech audio events. For accuracy, audio effects detection requires at least 2 seconds of clear nonspeech audio. Voice commands or singing aren't supported.  
95+
- Avoid use of audio with loud background music or music with repetitive and/or linearly scanned frequency, audio effects detection is designed for nonspeech audio only and therefore can't classify events in loud music. Music with repetitive and/or linearly scanned frequency many be incorrectly classified as an alarm or siren.
9696
- Carefully consider the methods of usage in law enforcement and similar institutions, to promote more accurate probabilistic data, carefully review the following:
9797

98-
- Audio effects can be detected in non-speech segments only.
99-
- The duration of a non-speech section should be at least 2 seconds.
98+
- Audio effects can be detected in nonspeech segments only.
99+
- The duration of a nonspeech section should be at least 2 seconds.
100100
- Low quality audio might impact the detection results.
101-
- Events in loud background music are not classified.
101+
- Events in loud background music aren't classified.
102102
- Music with repetitive and/or linearly scanned frequency might be incorrectly classified as an alarm or siren.
103-
- Knocking on a door or slamming a door might be labelled as a gunshot or explosion.
103+
- Knocking on a door or slamming a door might be labeled as a gunshot or explosion.
104104
- Prolonged shouting or sounds of physical human effort might be incorrectly classified.
105105
- A group of people laughing might be classified as both laughter and crowd.
106-
- Natural and non-synthetic gunshot and explosions sounds are supported.
106+
- Natural and nonsynthetic gunshot and explosions sounds are supported.
107107

108108
When used responsibly and carefully, Azure Video Indexer is a valuable tool for many industries. To respect the privacy and safety of others, and to comply with local and global regulations, we recommend the following:  
109109

110110
- Always respect an individual’s right to privacy, and only ingest audio for lawful and justifiable purposes.  
111-
- Do not purposely disclose inappropriate audio of young children or family members of celebrities or other content that may be detrimental or pose a threat to an individual’s personal freedom.  
111+
- Don't purposely disclose inappropriate audio of young children or family members of celebrities or other content that may be detrimental or pose a threat to an individual’s personal freedom.  
112112
- Commit to respecting and promoting human rights in the design and deployment of your analyzed audio.  
113113
- When using 3rd party materials, be aware of any existing copyrights or permissions required before distributing content derived from them. 
114114
- Always seek legal advice when using audio from unknown sources. 
115115
- Be aware of any applicable laws or regulations that exist in your area regarding processing, analyzing, and sharing audio containing people. 
116-
- Keep a human in the loop. Do not use any solution as a replacement for human oversight and decision-making.  
117-
- Fully examine and review the potential of any AI model you are using to understand its capabilities and limitations. 
116+
- Keep a human in the loop. Don't use any solution as a replacement for human oversight and decision-making.  
117+
- Fully examine and review the potential of any AI model you're using to understand its capabilities and limitations. 
118118

119119
## Next steps
120120
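
The article in this diff says the downloaded insights JSON lists audio effect categories with an ID, type, name, and per-category instances that carry timeframes and a confidence score between 0 and 1 (for example, 0.82 for 82% certainty). A minimal Python sketch of how such a list might be filtered by confidence follows. The diff truncates the JSON body, so the key names (`name`, `instances`, `confidence`, `start`, `end`), the sample values, and the helper `confident_instances` are assumptions for illustration only, not the service's documented schema.

```python
# Hypothetical sample shaped like the `audioEffects` element described in the
# article; key names and values are assumptions, not the documented schema.
sample_audio_effects = [
    {
        "id": 1,
        "type": "Laughter",
        "name": "laughter",
        "instances": [
            {"confidence": 0.82, "start": "0:00:10", "end": "0:00:12"},
            {"confidence": 0.41, "start": "0:00:30", "end": "0:00:33"},
        ],
    },
    {
        "id": 2,
        "type": "Siren",
        "name": "siren",
        "instances": [
            {"confidence": 0.95, "start": "0:01:05", "end": "0:01:09"},
        ],
    },
]


def confident_instances(audio_effects, min_confidence=0.8):
    """Collect (name, start, end, confidence) for instances at or above the threshold."""
    results = []
    for category in audio_effects:
        for instance in category.get("instances", []):
            # Instances without a confidence value are treated as 0.0 and skipped.
            if instance.get("confidence", 0.0) >= min_confidence:
                results.append(
                    (
                        category.get("name"),
                        instance.get("start"),
                        instance.get("end"),
                        instance["confidence"],
                    )
                )
    return results


high_confidence = confident_instances(sample_audio_effects, min_confidence=0.8)
```

A threshold like this only narrows what gets surfaced for review. In line with the article's guidance to keep a human in the loop, it isn't a substitute for checking the detections.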

articles/azure-video-indexer/insights-overview.md

Lines changed: 12 additions & 0 deletions
@@ -10,6 +10,18 @@ ms.author: juliako
1010

1111
When a video is indexed, Azure Video Indexer analyzes the video and audio content by running 30+ AI models, generating rich insights. Insights contain an aggregated view of the data: transcripts, optical character recognition elements (OCRs), face, topics, emotions, etc. Once the video is indexed and analyzed, Azure Video Indexer produces a JSON content that contains details of the video insights. For example, each insight type includes instances of time ranges that show when the insight appears in the video.
1212

13+
Read details about the following insights here:
14+
15+
- [Audio effects detection](audio-effects-detection-overview.md)
16+
- [Faces detection](face-detection.md)
17+
- [OCR](ocr.md)
18+
- [Keywords extraction](keywords.md)
19+
- [Transcription, translation, language](transcription-translation-lid.md)
20+
- [Labels identification](labels-identification.md)
21+
- [Named entities](named-entities.md)
22+
- [Observed people tracking & matched faces](observed-matched-people.md)
23+
- [Topics inference](topics-inference.md)
24+
1325
For information about features and other insights, see:
1426

1527
- [Azure Video Indexer overview](video-indexer-overview.md)

articles/azure-video-indexer/keywords.md

Lines changed: 6 additions & 6 deletions
@@ -20,10 +20,10 @@ Review [Transparency Note overview](/legal/azure-video-indexer/transparency-note
2020

2121
## General principles
2222

23-
This article discusses Keywords and the key considerations for making use of this technology responsibly. There are a number of things you need to consider when deciding how to use and implement an AI-powered feature:
23+
This article discusses Keywords and the key considerations for making use of this technology responsibly. There are many things you need to consider when deciding how to use and implement an AI-powered feature:
2424

2525
- Will this feature perform well in my scenario? Before deploying Keywords Extraction into your scenario, test how it performs using real-life data and make sure it can deliver the accuracy you need.
26-
- Are we equipped to identify and respond to errors? AI-powered products and features will not be 100% accurate, so consider how you will identify and respond to any errors that may occur.
26+
- Are we equipped to identify and respond to errors? AI-powered products and features won't be 100% accurate, so consider how you'll identify and respond to any errors that may occur.
2727

2828
## View the insight
2929

@@ -112,20 +112,20 @@ During the Keywords procedure, audio and images in a media file are processed, a
112112
Below are some considerations to keep in mind when using keywords extraction:
113113

114114
- When uploading a file always use high-quality video content. The recommended maximum frame size is HD and frame rate is 30 FPS. A frame should contain no more than 10 people. When outputting frames from videos to AI models, only send around 2 or 3 frames per second. Processing 10 and more frames might delay the AI result.
115-
- When uploading a file always use high quality audio and video content. At least 1 minute of spontaneous conversational speech is required to perform analysis. Audio effects are detected in non-speech segments only. The minimal duration of a non-speech section is 2 seconds. Voice commands and singing are not supported. 
115+
- When uploading a file always use high quality audio and video content. At least 1 minute of spontaneous conversational speech is required to perform analysis. Audio effects are detected in non-speech segments only. The minimal duration of a non-speech section is 2 seconds. Voice commands and singing aren't supported. 
116116

117117
When used responsibly and carefully Keywords is a valuable tool for many industries. To respect the privacy and safety of others, and to comply with local and global regulations, we recommend the following:  
118118

119119
- Always respect an individual’s right to privacy, and only ingest media for lawful and justifiable purposes.  
120-
- Do not purposely disclose inappropriate media showing young children or family members of celebrities or other content that may be detrimental or pose a threat to an individual’s personal freedom.  
120+
- Don't purposely disclose inappropriate media showing young children or family members of celebrities or other content that may be detrimental or pose a threat to an individual’s personal freedom.  
121121
- Commit to respecting and promoting human rights in the design and deployment of your analyzed media.  
122122
- When using 3rd party materials, be aware of any existing copyrights or permissions required before distributing content derived from them. 
123123
- Always seek legal advice when using media from unknown sources. 
124124
- Always obtain appropriate legal and professional advice to ensure that your uploaded media is secured and have adequate controls to preserve the integrity of your content and to prevent unauthorized access.    
125125
- Provide a feedback channel that allows users and individuals to report issues with the service.  
126126
- Be aware of any applicable laws or regulations that exist in your area regarding processing, analyzing, and sharing media containing people. 
127-
- Keep a human in the loop. Do not use any solution as a replacement for human oversight and decision-making.  
128-
- Fully examine and review the potential of any AI model you are using to understand its capabilities and limitations. 
127+
- Keep a human in the loop. Don't use any solution as a replacement for human oversight and decision-making.  
128+
- Fully examine and review the potential of any AI model you're using to understand its capabilities and limitations. 
129129

130130
## Next steps
131131
