Azure ML Studio and Tableau

For the first assignment, download the Tableau workbook file below and complete the instructions inside it using the embedded data (no dataset upload required). Send the completed Tableau workbook.

For the second assignment, create a basic machine learning modeling experiment in Azure ML Studio to predict one of the candidate labels listed below.

  • Potential features:
      • Gender: of the person who posted the Tweet
      • Country or State: of the location where the Tweet originated from
      • Weekday, Day, Hour: of the date it was tweeted
      • Klout: a score representing how “popular” or “important” the person is who posted the tweet
      • Sentiment: a score representing the tone of the tweet text
      • Reach: how many people had viewed the tweet at the time the data was collected
      • IsReshare: whether or not the tweet was a reshare of another tweet
      • RetweetCount: the number of “Retweets” other users had given the tweet
      • Likes: the number of “Likes” other users had given the tweet
      • Lang: the language that the tweet was written in
  • Candidate labels: Each of these features might represent the popularity or impact of a tweet, but you can use only one. Your goal is to select a label that is 1) as meaningful as possible, and 2) as easy as possible to predict with strong accuracy and fit metrics. You’ll find that those objectives can conflict with each other at times: more of one may mean less of the other. Choose carefully.
    • Reach
    • IsReshare
    • RetweetCount
    • Likes

Requirements:

  • Build an experiment in Azure ML Studio to predict one of the candidate labels listed above or some derived version of those labels.
  • Follow the pattern and techniques learned in this module to select columns, split the data into training and testing sets, and then train, score, and evaluate the model.
    • You should select/include any feature that you think should logically explain or predict your label.
  • Use linear regression to train the model.
    • You will learn later that other algorithms are better suited to count-based data like RetweetCount, Likes, and Reach, but don’t worry about that for now.
  • Complete any relevant data preparation tasks demonstrated in the textbook chapters and in class (minimum 3 types).
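
If it helps to see the select / split / train / score / evaluate pattern outside the Studio canvas, here is a rough plain-Python sketch of the same flow. It is only an analogy, not what Azure ML Studio runs internally. The rows are synthetic and invented for the demo, the helper names (make_rows, split_rows, fit_linear_regression, and so on) are hypothetical, and the log-transformed Likes label is just one possible "derived version" of a candidate label.

```python
import math
import random


def make_rows(n, seed=0):
    """Build synthetic stand-in rows; the relationship below is invented for the demo."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        klout = rng.uniform(10, 80)
        sentiment = rng.uniform(-2, 2)
        log_likes = 0.05 * klout + 0.3 * sentiment + rng.gauss(0, 0.2)
        likes = max(0, round(math.expm1(log_likes)))
        rows.append({"Klout": klout, "Sentiment": sentiment, "Likes": likes})
    return rows


def select_columns(rows, features, label):
    """Keep the chosen feature columns and derive the label as log(1 + count)."""
    X = [[row[f] for f in features] for row in rows]
    y = [math.log1p(row[label]) for row in rows]
    return X, y


def split_rows(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows, then cut it into training and testing sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations; returns [intercept, w1, ...]."""
    Xb = [[1.0] + list(row) for row in X]
    k = len(Xb[0])
    A = [[sum(r[i] * r[j] for r in Xb) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(k)]


def score(weights, X):
    """Predict the label for each row of features."""
    return [weights[0] + sum(w * x for w, x in zip(weights[1:], row)) for row in X]


def evaluate(y_true, y_pred):
    """Return (RMSE, R squared) for the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return math.sqrt(ss_res / len(y_true)), 1.0 - ss_res / ss_tot


rows = make_rows(200)
train, test = split_rows(rows)
X_train, y_train = select_columns(train, ["Klout", "Sentiment"], "Likes")
X_test, y_test = select_columns(test, ["Klout", "Sentiment"], "Likes")
weights = fit_linear_regression(X_train, y_train)
rmse, r2 = evaluate(y_test, score(weights, X_test))
print(f"RMSE={rmse:.3f}  R2={r2:.3f}")
```

In Studio, these steps correspond roughly to the Select Columns in Dataset, Split Data, Linear Regression with Train Model, Score Model, and Evaluate Model modules. The metrics are computed on the testing set only, which is the reason for splitting the data before training.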