Follow the steps below to set up the Azure environment.
Clone or download this repository and navigate to the project's root directory.
The button below will deploy secrets into the Azure Key Vault you are using for this solution:
Note: This deployment assumes you have created a Twitter developer account with "Elevated" access to the Twitter API's features (see the Twitter developer documentation for more information on the access levels) and a News API account.
To deploy the secrets into the Azure Key Vault, populate the following variables:
- Resource group: Use the same resource group where the previous ARM template was deployed.
- Region: This field will be auto-filled.
- Key Vault Name: The name of the Azure Key Vault that was created during previous steps
The four Twitter API keys and tokens listed next can be obtained in your Twitter developer account:
- Twitter API Key
- Twitter API Secret Key
- Twitter Access Token
- Twitter Access Token Secret
- News API Key: Can be obtained from newsapi.org
- Text Analytics Key: The key for your Text Analytics resource that was created in previous steps
- Text Analytics Endpoint: The endpoint for your Text Analytics resource that was created in previous steps
- Text Analytics Region: The region your Text Analytics resource is in
- Translator Key: The key to your Translator resource that was created in previous steps
- Translator Endpoint: The endpoint to your Translator resource that was created in previous steps
- Translator Region: The region your Translator resource is in
- Map Key: The key to your Map resource that was created in previous steps
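Before launching the deployment, it can help to confirm that none of the values listed above is missing. The following is a minimal sketch and not part of the solution's templates: the names in `REQUIRED_PARAMETERS` are hypothetical labels that mirror the list above and may not match the names the deployment form actually uses.

```python
# Hypothetical parameter labels mirroring the list above; the real
# names in the deployment template may differ.
REQUIRED_PARAMETERS = [
    "key_vault_name",
    "twitter_api_key",
    "twitter_api_secret_key",
    "twitter_access_token",
    "twitter_access_token_secret",
    "news_api_key",
    "text_analytics_key",
    "text_analytics_endpoint",
    "text_analytics_region",
    "translator_key",
    "translator_endpoint",
    "translator_region",
    "map_key",
]


def find_missing(values: dict) -> list:
    """Return the required parameters that are absent or blank."""
    return [
        name
        for name in REQUIRED_PARAMETERS
        if not str(values.get(name) or "").strip()
    ]
```

Calling `find_missing` on a dictionary of collected values returns an empty list when everything is filled in, and otherwise the names still to be provided, in the order of the list above.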
- Go to the Key Vault that was created in the previous step.
- Click "Access policies", then click "+ Add Access Policy". In the new window, under "Secret Permissions", select "Get" and "List".
- Under the "Select principal" option, add your Synapse resource name so it is added to the Key Vault, and click "Save".
- Select "Review + create" and "Save" the changes made.
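For reference, the access policy configured in the steps above can be written down in the shape an ARM template uses for a Key Vault `accessPolicies` entry. This is only an illustrative sketch: the tenant ID and object ID below are placeholders, and the helper name `secret_read_policy` is hypothetical. The object ID would be that of the principal representing your Synapse resource.

```python
import json


def secret_read_policy(tenant_id: str, object_id: str) -> dict:
    """Build an access-policy entry granting only Get and List on secrets."""
    return {
        "tenantId": tenant_id,
        "objectId": object_id,
        "permissions": {"secrets": ["get", "list"]},
    }


# Placeholder IDs for illustration only.
policy = secret_read_policy(
    tenant_id="00000000-0000-0000-0000-000000000000",
    object_id="11111111-1111-1111-1111-111111111111",
)
print(json.dumps(policy, indent=2))
```

Limiting the entry to `get` and `list` on secrets matches the portal steps: Synapse can read the stored keys but cannot modify or delete them.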
Before you can upload assets to the Synapse Workspace you will need to add your IP address:
- Go to the Synapse resource you created in the previous step.
- Navigate to "Networking" under "Security" on the left-hand side of the page.
- In the middle of the screen, click "+ Add client IP".
- Your IP address should now be visible in the IP list.
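The "+ Add client IP" button effectively creates a firewall rule whose start and end addresses are both your current IP. A minimal sketch of that rule, in the `startIpAddress`/`endIpAddress` shape used by Synapse workspace firewall rules, is shown here. The rule name `ClientIp` and the helper name are hypothetical, and the sketch assumes an IPv4 address.

```python
import ipaddress


def client_ip_rule(ip: str, name: str = "ClientIp") -> dict:
    """Build a single-address firewall rule for the given IPv4 address."""
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if address.version != 4:
        raise ValueError("expected an IPv4 address")
    return {
        "name": name,
        "properties": {
            # A one-address range: start and end are identical.
            "startIpAddress": str(address),
            "endIpAddress": str(address),
        },
    }
```

For example, `client_ip_rule("203.0.113.7")` yields a rule covering only that documentation-range address; substitute your own public IP.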
To perform the necessary actions in the Synapse workspace, you will need to grant your user account additional access to the workspace's storage account:
- Go to the Azure Data Lake Storage Account for your Synapse Workspace.
- Go to "Access Control (IAM)" > "+ Add" > "Add role assignment".
- Search for and select the "Storage Blob Data Contributor" role and click "Next".
- Click "+ Select members", search for and select your username, and click "Select".
- Click "Review and assign" at the bottom.
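The role assignment made in the steps above can be sketched as data: a scope (the storage account), a role definition, and a principal. In the sketch below, the GUID is the built-in role definition ID for Storage Blob Data Contributor; the subscription, resource group, storage account, and principal values are placeholders, and the helper name is hypothetical.

```python
# Built-in role definition ID for "Storage Blob Data Contributor".
STORAGE_BLOB_DATA_CONTRIBUTOR = "ba92f5b4-2d11-453d-a403-e96b0029c9fe"


def blob_contributor_assignment(
    subscription_id: str,
    resource_group: str,
    storage_account: str,
    principal_id: str,
) -> dict:
    """Describe a role assignment scoped to a single storage account."""
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    role_definition_id = (
        f"/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleDefinitions/"
        f"{STORAGE_BLOB_DATA_CONTRIBUTOR}"
    )
    return {
        "scope": scope,
        "properties": {
            "roleDefinitionId": role_definition_id,
            "principalId": principal_id,
        },
    }
```

Scoping the assignment to the storage account, rather than the whole resource group, mirrors where the portal steps are performed.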
- Launch the Synapse workspace.
- Select the subscription and workspace name you are using for this solution accelerator.
- Navigate to the "Manage" tab in the Studio and click "Apache Spark pools".
- Click "..." on the deployed Spark pool and select "Packages".
- Click "Upload" and select requirements.txt from the cloned repo.
- Click "Apply".
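Spark pool package installs tend to be more reproducible when every entry in the requirements file pins an exact version. As an optional pre-upload check, the sketch below lists entries that are not of the form `name==version`. It is a simplified illustration: the regular expression handles only plain package names with optional extras, and the helper name is hypothetical.

```python
import re

# Matches "name==version" with an optional extras block, e.g. "pkg[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^\s=]+$")


def unpinned_requirements(text: str) -> list:
    """Return requirement lines that do not pin an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            problems.append(line)
    return problems
```

A typical use is `unpinned_requirements(open("requirements.txt").read())`; an empty list means every entry is pinned.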
- Launch the Synapse workspace.
- Select the subscription and workspace name you are using for this solution accelerator.
- Navigate to the "Manage" hub and, under "External connection", click "Linked services".
- Click "+ New", select "Azure Key Vault", select the subscription you are using for this solution from the "Azure Subscription" dropdown, and select your Azure Key Vault name from the "Azure key vault name" dropdown.
- Change the name of the linked service to KeyVaultLinkedService.
- Click "Test connection" and click "Save".
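The linked service created in the steps above corresponds to a small JSON definition of type `AzureKeyVault`. The sketch below builds that definition for comparison with what the Studio shows in its code view. The vault name is a placeholder, the helper name is hypothetical, and the `vault.azure.net` base URL assumes the public Azure cloud.

```python
import json


def key_vault_linked_service(
    vault_name: str, name: str = "KeyVaultLinkedService"
) -> dict:
    """Build a linked-service definition pointing at an Azure Key Vault."""
    return {
        "name": name,
        "properties": {
            "type": "AzureKeyVault",
            "typeProperties": {
                # Assumes the public Azure cloud DNS suffix.
                "baseUrl": f"https://{vault_name}.vault.azure.net/",
            },
        },
    }


# "my-key-vault" is a placeholder vault name.
definition = key_vault_linked_service("my-key-vault")
print(json.dumps(definition, indent=2))
```

Keeping the name exactly `KeyVaultLinkedService` matters because it is the name these instructions ask you to set.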
- Launch the Synapse workspace.
- Select the subscription and workspace name you are using for this solution accelerator.
- In Synapse Studio, navigate to the "Data" hub.
- Select "Linked".
- Under the category "Azure Data Lake Storage Gen2" you'll see an item with a name like xxxxx(xxxxx- Primary).
- Select the container named socialmediaadlsfs (Primary), select "New folder", enter CountryCoordinates and select "Create".
- In the CountryCoordinates folder, select "Upload" to upload the .csv file CountryCoordinates.csv.
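Once uploaded, the file is addressable from Spark through an `abfss://` URI built from the container, the storage account, and the folder created above. The sketch below assembles that URI. The storage account name is a placeholder, the helper name is hypothetical, and the `dfs.core.windows.net` suffix assumes the public Azure cloud.

```python
def country_coordinates_uri(
    storage_account: str, container: str = "socialmediaadlsfs"
) -> str:
    """Return the ADLS Gen2 URI of the uploaded CountryCoordinates.csv."""
    return (
        f"abfss://{container}@{storage_account}.dfs.core.windows.net"
        "/CountryCoordinates/CountryCoordinates.csv"
    )


# "mystorageaccount" is a placeholder account name.
print(country_coordinates_uri("mystorageaccount"))
```

Comparing this URI with the path shown in the file's properties in the "Data" hub is a quick way to confirm the folder and file names were entered exactly.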
- Launch the Synapse workspace.
- Go to "Develop", click the "+", and click "Import" to select all notebooks from this repository's folder.
- For each of the notebooks, select "Attach to" > spark1 in the top dropdown.
- In the Synapse workspace, go to "Integrate", click the "+", and choose "Pipeline".
- When the new "Pipeline 1" appears, click the three dots "..." in the right corner, click "Rename", and change the pipeline name to Process_News_and_Twitter_Data_Pipeline.
- Click the "{}" button at the top right corner to open the Code window.
- Copy and paste the contents of Process_News_and_Twitter_Data_Pipeline.json.
- Click "OK" to apply.
- Click "Publish all" at the top of the page.
- Click "Add trigger", select "Trigger now" to trigger the pipeline, and populate the pipeline parameters.
- Alternatively, add a scheduled trigger to run the pipeline on a daily basis.
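For the daily option, a scheduled trigger is defined by a recurrence plus a reference to the pipeline. The sketch below builds such a definition in the `ScheduleTrigger` JSON shape. The trigger name `DailyTrigger`, the helper name, and the start time are hypothetical, and the pipeline parameters are left empty because their names are defined in the pipeline JSON rather than here.

```python
import json
from datetime import datetime, timezone


def daily_trigger(pipeline_name, start, parameters=None, name="DailyTrigger"):
    """Build a once-a-day schedule trigger; `start` must be timezone-aware."""
    return {
        "name": name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Day",
                    "interval": 1,
                    # Normalize the start time to UTC in ISO 8601 form.
                    "startTime": start.astimezone(timezone.utc).strftime(
                        "%Y-%m-%dT%H:%M:%SZ"
                    ),
                    "timeZone": "UTC",
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    },
                    "parameters": dict(parameters or {}),
                }
            ],
        },
    }


trigger = daily_trigger(
    "Process_News_and_Twitter_Data_Pipeline",
    start=datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc),  # example start time
)
print(json.dumps(trigger, indent=2))
```

Any parameter values you would enter for "Trigger now" would go in the `parameters` mapping so that each scheduled run receives them.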
- Open the Power BI report template in this repository.
- Enter the Synapse Serverless SQL endpoint and default for the SQLPool/database name when prompted.
  - To find the endpoint, navigate to the Synapse Workspace overview page in the Azure Portal and copy the Serverless SQL endpoint.
- Select "Refresh" after all the tables load and the dashboard with three pages shows up.
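As a sanity check on the value copied from the portal, the Serverless SQL endpoint of a workspace normally follows the pattern `<workspace>-ondemand.sql.azuresynapse.net`. The sketch below assembles the two values the template prompts for. The workspace name is a placeholder, the helper name is hypothetical, and the endpoint shown in the portal remains the authoritative value.

```python
def power_bi_prompt_values(workspace_name: str) -> dict:
    """Return the endpoint and database name to enter in the template."""
    return {
        # Usual pattern for a workspace's serverless SQL endpoint.
        "endpoint": f"{workspace_name}-ondemand.sql.azuresynapse.net",
        "database": "default",
    }


# "myworkspace" is a placeholder workspace name.
print(power_bi_prompt_values("myworkspace"))
```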




