You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/machine-learning/component-reference/convert-word-to-vector.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -129,7 +129,7 @@ This section contains tips and answers to frequently asked questions.
129
129
130
130
+ Difference between online-training and pretrained model:
131
131
132
-
In this Convert Word to Vector component, we provided three different strategies: two online-training models and one pretrained model. The online-training models use your input dataset as training data, and generate vocabulary and word vectors during training. The pretrained model is already trained by a much larger text corpus, such as Wikipedia or Twitter text. The pretrained model is actually a collection of word/embedding pairs.
132
+
In this Convert Word to Vector component, we provided three different strategies: two online-training models and one pretrained model. The online-training models use your input dataset as training data, and generate vocabulary and word vectors during training. The pretrained model is already trained by a much larger text corpus, such as Wikipedia or X text. The pretrained model is actually a collection of word/embedding pairs.
133
133
134
134
The GloVe pre-trained model summarizes a vocabulary from the input dataset and generates an embedding vector for each word from the pretrained model. Without online training, the use of a pretrained model can save training time. It has better performance, especially when the input dataset size is relatively small.
Copy file name to clipboardExpand all lines: articles/machine-learning/v1/samples-designer.md
+2-2Lines changed: 2 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -124,8 +124,8 @@ The sample datasets are available under **Datasets**-**Samples** category. You c
124
124
|CRM Upselling Labels Shared|Labels from the KDD Cup 2009 customer relationship prediction challenge ([orange_large_train_upselling.labels](https://kdd.org/cupfiles/KDDCupData/2009/orange_small_train_upselling.labels)|
125
125
|Flight Delays Data|Passenger flight on-time performance data taken from the TranStats data collection of the U.S. Department of Transportation ([On-Time](https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time)).<br/>The dataset covers the time period April-October 2013. Before uploading to the designer, the dataset was processed as follows: <br/>- The dataset was filtered to cover only the 70 busiest airports in the continental US <br/>- Canceled flights were labeled as delayed by more than 15 minutes <br/>- Diverted flights were filtered out <br/>- The following columns were selected: Year, Month, DayofMonth, DayOfWeek, Carrier, OriginAirportID, DestAirportID, CRSDepTime, DepDelay, DepDel15, CRSArrTime, ArrDelay, ArrDel15, Canceled|
126
126
|German Credit Card UCI dataset|The UCI Statlog (German Credit Card) dataset ([Statlog+German+Credit+Data](https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data)), using the german.data file.<br/>The dataset classifies people, described by a set of attributes, as low or high credit risks. Each example represents a person. There are 20 features, both numerical and categorical, and a binary label (the credit risk value). High credit risk entries have label = 2, low credit risk entries have label = 1. The cost of misclassifying a low risk example as high is 1, whereas the cost of misclassifying a high risk example as low is 5.|
127
-
|IMDB Movie Titles|The dataset contains information about movies that were rated in Twitter tweets: IMDB movie ID, movie name, genre, and production year. There are 17K movies in the dataset. The dataset was introduced in the paper "S. Dooms, T. De Pessemier and L. Martens. MovieTweetings: a Movie Rating Dataset Collected From Twitter. Workshop on Crowdsourcing and Human Computation for Recommender Systems, CrowdRec at RecSys 2013."|
128
-
|Movie Ratings|The dataset is an extended version of the Movie Tweetings dataset. The dataset has 170K ratings for movies, extracted from well-structured tweets on Twitter. Each instance represents a tweet and is a tuple: user ID, IMDB movie ID, rating, timestamp, number of favorites for this tweet, and number of retweets of this tweet. The dataset was made available by A. Said, S. Dooms, B. Loni and D. Tikk for Recommender Systems Challenge 2014.|
127
+
|IMDB Movie Titles|The dataset contains information about movies that were rated in X tweets: IMDB movie ID, movie name, genre, and production year. There are 17K movies in the dataset. The dataset was introduced in the paper "S. Dooms, T. De Pessemier and L. Martens. MovieTweetings: a Movie Rating Dataset Collected From Twitter. Workshop on Crowdsourcing and Human Computation for Recommender Systems, CrowdRec at RecSys 2013."|
128
+
|Movie Ratings|The dataset is an extended version of the Movie Tweetings dataset. The dataset has 170K ratings for movies, extracted from well-structured tweets on X. Each instance represents a tweet and is a tuple: user ID, IMDB movie ID, rating, timestamp, number of favorites for this tweet, and number of retweets of this tweet. The dataset was made available by A. Said, S. Dooms, B. Loni and D. Tikk for Recommender Systems Challenge 2014.|
129
129
|Weather Dataset|Hourly land-based weather observations from NOAA ([merged data from 201304 to 201310](https://az754797.vo.msecnd.net/data/WeatherDataset.csv)).<br/>The weather data covers observations made from airport weather stations, covering the time period April-October 2013. Before uploading to the designer, the dataset was processed as follows: <br/> - Weather station IDs were mapped to corresponding airport IDs <br/> - Weather stations not associated with the 70 busiest airports were filtered out <br/> - The Date column was split into separate Year, Month, and Day columns <br/> - The following columns were selected: AirportID, Year, Month, Day, Time, TimeZone, SkyCondition, Visibility, WeatherType, DryBulbFarenheit, DryBulbCelsius, WetBulbFarenheit, WetBulbCelsius, DewPointFarenheit, DewPointCelsius, RelativeHumidity, WindSpeed, WindDirection, ValueForWindCharacter, StationPressure, PressureTendency, PressureChange, SeaLevelPressure, RecordType, HourlyPrecip, Altimeter|
130
130
|Wikipedia SP 500 Dataset|Data is derived from Wikipedia (https://www.wikipedia.org/) based on articles of each S&P 500 company, stored as XML data. <br/>Before uploading to the designer, the dataset was processed as follows: <br/> - Extract text content for each specific company <br/> - Remove wiki formatting <br/> - Remove non-alphanumeric characters <br/> - Convert all text to lowercase <br/> - Known company categories were added <br/>Note that for some companies an article could not be found, so the number of records is less than 500.|
131
131
|Restaurant Feature Data| A set of metadata about restaurants and their features, such as food type, dining style, and location. <br/>**Usage**: Use this dataset, in combination with the other two restaurant datasets, to train and test a recommender system.<br/> **Related Research**: Bache, K. and Lichman, M. (2013). [UCI Machine Learning Repository](https://archive.ics.uci.edu/). Irvine, CA: University of California, School of Information and Computer Science.|
Copy file name to clipboardExpand all lines: articles/migrate/appcat/faq.yml
+1-1Lines changed: 1 addition & 1 deletion
Original file line number
Diff line number
Diff line change
@@ -11,7 +11,7 @@ title: Frequently asked questions about Azure Migrate application and code asses
11
11
summary: |
12
12
This article provides answers to some of the most common questions about [Azure Migrate application and code assessment](overview.md).
13
13
If your Azure issue isn't addressed in this article, visit the Azure forums on [Microsoft Q & A and Stack Overflow](https://azure.microsoft.com/support/forums/).
14
-
You can post your issue in these forums, or post to [@AzureSupport on Twitter](https://twitter.com/AzureSupport).
14
+
You can post your issue in these forums, or post to [@AzureSupport on X](https://x.com/azuresupport).
15
15
You can also submit an Azure support request. To submit a support request, on the [Azure support page](https://azure.microsoft.com/support/options/), select **Get support**.
Copy file name to clipboardExpand all lines: articles/open-datasets/dataset-san-francisco-safety.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -43,7 +43,7 @@ This dataset is stored in the East US Azure region. Allocating compute resources
43
43
| dateTime | timestamp | 6,496,563 | 2020-10-19 12:28:08 2020-07-28 06:40:26 | The date and time when the service request was made or when the fire call was received. |
44
44
| latitude | double | 1,615,369 | 37.777624238929 37.786117211838 | Latitude of the location, using the WGS84 projection. |
45
45
| longitude | double | 1,554,612 | -122.39998111124 -122.419854245692 | Longitude of the location, using the WGS84 projection. |
46
-
| source | string | 9 | Phone Mobile/Open311 | Mechanism or path by which the service request was received; typically “Phone”, “Text/SMS”, “Website”, “Mobile App”, “Twitter”, etc. but terms may vary by system. |
46
+
| source | string | 9 | Phone Mobile/Open311 | Mechanism or path by which the service request was received; typically “Phone”, “Text/SMS”, “Website”, “Mobile App”, “X”, etc. but terms may vary by system. |
47
47
| status | string | 3 | Closed Open | A single-word indicator of the current state of the service request. (Note: GeoReport V2 only permits “open” and “closed”) |
48
48
| subcategory | string | 1,270 | Medical Incident Bulky Items | The human readable name of the service request subtype for 311 cases or call type for 911 fire calls. |
0 commit comments