2020-03-03 08:24:24

Preliminaries

Announcements

  • No class next week (Spring Break)
  • HW: Complete a 1-2 page write-up describing your plans for your final project. (Commit it to the repo you’re using for that work.)

  • Women+ in Statistics and Data Science Group
    • Seeking mentors and mentees application
    • Symposium on Statistics and Data Science (in Pgh), submissions due March 10

Today’s topics

  • Why you must plot first
  • Interactive web apps with Shiny

Why you must plot first

Anscombe’s Quartet
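
A minimal sketch of why plotting comes first, using the anscombe data frame that ships with base R. The name quartet_summary is ours, chosen for illustration. The four x/y pairs have nearly identical means and correlations, yet the plots look nothing alike.

```r
# Anscombe's quartet is available in base R as the data frame `anscombe`
# (columns x1..x4 and y1..y4).
data(anscombe)

# Summarize each of the four x/y pairs.
quartet_summary <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  data.frame(
    set    = i,
    mean_x = mean(x),
    mean_y = mean(y),
    cor_xy = cor(x, y)
  )
}))

# The rows are nearly identical even though the data sets are not.
print(round(quartet_summary, 2))

# Plotting reveals the differences that the summaries hide.
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
}
par(op)
```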

Datasaurus dozen (Matejka & Fitzmaurice)

Shiny

Background

  • RStudio framework for making interactive, web-based data applications
  • Basic tutorial
  • Gallery

Components

  • User interface (UI)
  • Server instructions
  • Web server
  • Web browser

Template

library(shiny)
ui <- fluidPage()
server <- function(input, output) {}
shinyApp(ui = ui, server = server)

Correlation demo

  • Goal: Create an app to visualize the effects of changing the slope (m) and error (sd) in a linear relation

\[Y = X\beta + \varepsilon, \quad \varepsilon \sim N(0,\sigma)\]

or

\[Y = X_0\beta_0 + X_1\beta_1 + \varepsilon, \quad \varepsilon \sim N(0,\sigma)\]

  • Inputs
    • number of points
    • slope
    • error
  • Outputs
    • scatter plot with fitted line
    • correlation value
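
Before wiring anything into Shiny, the model can be tried in plain R. The sketch below assumes x is drawn uniformly on [0, 1], as in the server code later in these notes. The helper names simulate_xy and expected_cor are hypothetical, not part of any package. For Uniform(0, 1), Var(x) = 1/12, which gives the expected correlation for a given slope and error sd.

```r
# Hypothetical helper: simulate y = slope * x + noise, with x ~ Uniform(0, 1).
simulate_xy <- function(n, slope, sd) {
  x <- runif(n)
  y <- slope * x + rnorm(n, sd = sd)
  data.frame(x = x, y = y)
}

# Expected correlation under this model (assumes x ~ Uniform(0, 1)):
# cor = slope * sd_x / sqrt(slope^2 * Var(x) + sd^2), with Var(x) = 1/12.
expected_cor <- function(slope, sd) {
  var_x <- 1 / 12
  slope * sqrt(var_x) / sqrt(slope^2 * var_x + sd^2)
}

set.seed(1)  # arbitrary seed, only for reproducibility
low_noise  <- simulate_xy(n = 200, slope = 1, sd = 0.0001)
high_noise <- simulate_xy(n = 200, slope = 1, sd = 5)

# Sample correlations: close to 1 with tiny error, much weaker with large error.
cor_low  <- cor(low_noise$x, low_noise$y)
cor_high <- cor(high_noise$x, high_noise$y)
```

Comparing cor_low and cor_high with expected_cor(1, 0.0001) and expected_cor(1, 5) shows how the error slider should move the correlation in the app.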

UI input components

  • Sliders for numeric input use the sliderInput() function
  • Key parameters: inputId, label, min, max, and value
sliderInput(inputId = "points",
  label = "Number of points:",
  min = 10,
  max = 200,
  value = 50)

Sliders for slope, error

sliderInput(inputId = "slope",
  label = "Slope:",
  min = -10,
  max = 10,
  value = 1)
      
sliderInput(inputId = "error",
  label = "Error:",
  min = .0001,
  max = 5,
  value = 0.5)

Under the hood

  • Adjust the slider \(\rightarrow\) the sliderInput function converts the slider position into a number
    • That number is stored under the UI element's inputId
    • Shiny collects these values in a list-like object, input, with one element per inputId: input$points, input$slope, …

  • A server(input, output) function takes the ui inputs and creates an output
  • The UI takes the output and shows it on the screen
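
The round trip above can be seen in a very small app. This is a sketch rather than part of the demo; the outputId "echo" is a hypothetical name. The slider writes to input$points, the server reads it and fills output$echo, and the UI displays the result.

```r
library(shiny)

ui <- fluidPage(
  sliderInput(inputId = "points", label = "Number of points:",
              min = 10, max = 200, value = 50),
  # Placeholder that the server fills in; "echo" is a hypothetical outputId
  textOutput(outputId = "echo")
)

server <- function(input, output) {
  # Re-runs automatically whenever input$points changes
  output$echo <- renderText({
    paste("input$points is currently", input$points)
  })
}

# Assigned rather than printed, so sourcing this script does not start the app
app <- shinyApp(ui = ui, server = server)
```

In an interactive session, runApp(app) launches it; moving the slider updates the text immediately.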

Server function

  • Syntax is server <- function(input, output) {}
  • Use input$points, input$slope, and input$error to grab values from UI
  • Save plot to output$scatterPlot

Calculating points and plotting them

  • Generate n = input$points random values as x (uniform on [0, 1], via runif())
  • Calculate y = x * input$slope + error
    • where error = rnorm(input$points, sd = input$error)
  • Plot x, y

output$scatterPlot <- renderPlot({
    # Draw input$points x values uniformly from [0, 1]
    x <- runif(input$points)
    
    # R recycles the scalar slope across every element of x;
    # the noise term adds one normal draw per point
    y <- input$slope * x + 
        rnorm(input$points, sd = input$error)
        
    # Draw the scatter plot with a smoothed (loess) curve
    scatter.smooth(x = x, y = y, xlab = "x", ylab = "y")
})
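
Note that scatter.smooth() adds a loess curve, not a straight line. If a least-squares line is preferred for the "fitted line" output, one option is a small helper like the one below. plot_linear_fit is a hypothetical name, and this is a sketch of an alternative rather than the code used in the demo.

```r
# Hypothetical helper: scatter plot with a least-squares line.
# Returns the fitted model invisibly so callers can inspect it.
plot_linear_fit <- function(x, y) {
  fit <- lm(y ~ x)
  plot(x, y, xlab = "x", ylab = "y")
  abline(fit, col = "red")
  invisible(fit)
}

# Try it outside Shiny with simulated data (true slope = 2, small error).
set.seed(42)  # arbitrary seed
x <- runif(100)
y <- 2 * x + rnorm(100, sd = 0.01)

pdf(NULL)  # graphics device that writes no file
fit <- plot_linear_fit(x, y)
invisible(dev.off())

# Estimated slope from the fitted model
slope_hat <- unname(coef(fit)["x"])
```

Inside renderPlot(), the call plot_linear_fit(x, y) could take the place of scatter.smooth().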

The whole shebang

library(shiny)

# Define UI for application that draws a scatterplot
ui <- fluidPage(
  
  # Application title
  titlePanel("Correlation Demo"),
  
  # Sidebar with slider inputs for number of points, slope, and error
  sidebarLayout(
    sidebarPanel(
      sliderInput(inputId = "points",
                  label = "Number of points:",
                  min = 10,
                  max = 200,
                  value = 50),
      sliderInput(inputId = "slope",
                  label = "Slope:",
                  min = -10,
                  max = 10,
                  value = 1),
      sliderInput(inputId = "error",
                  label = "Error:",
                  min = .0001,
                  max = 5,
                  value = 0.5)
    ),
    
    # Show the scatter plot
    mainPanel(
      plotOutput(outputId = "scatterPlot")
    )
  )
)

# Define server logic
server <- function(input, output) {
  
  output$scatterPlot <- renderPlot({
    
    # Calculate x, y, with slope and error
    x <- runif(input$points)
    y <- input$slope * x + rnorm(input$points, sd = input$error)
    
    # draw the plot
    scatter.smooth(x = x, y = y, xlab = "x", ylab = "y")
  })
}

# Run the application
shinyApp(ui = ui, server = server)
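
The planned outputs also include a correlation value, which the app above does not yet display. One possible extension is sketched below. The outputId "corText" and the reactive sim are hypothetical names. The simulation moves into a reactive() so that the plot and the correlation are computed from the same random draw.

```r
library(shiny)

ui <- fluidPage(
  titlePanel("Correlation Demo"),
  sidebarLayout(
    sidebarPanel(
      sliderInput(inputId = "points", label = "Number of points:",
                  min = 10, max = 200, value = 50),
      sliderInput(inputId = "slope", label = "Slope:",
                  min = -10, max = 10, value = 1),
      sliderInput(inputId = "error", label = "Error:",
                  min = .0001, max = 5, value = 0.5)
    ),
    mainPanel(
      plotOutput(outputId = "scatterPlot"),
      # "corText" is a hypothetical outputId for the correlation value
      textOutput(outputId = "corText")
    )
  )
)

server <- function(input, output) {
  # Simulate once per input change; both outputs read the same data
  sim <- reactive({
    x <- runif(input$points)
    y <- input$slope * x + rnorm(input$points, sd = input$error)
    data.frame(x = x, y = y)
  })

  output$scatterPlot <- renderPlot({
    d <- sim()
    scatter.smooth(x = d$x, y = d$y, xlab = "x", ylab = "y")
  })

  output$corText <- renderText({
    d <- sim()
    paste("Correlation:", round(cor(d$x, d$y), 3))
  })
}

# Assigned rather than printed, so sourcing this script does not start the app
app <- shinyApp(ui = ui, server = server)
```

If each render block called runif() and rnorm() on its own, the displayed correlation would describe a different sample from the one plotted; the shared reactive avoids that. Launch with runApp(app) in an interactive session.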

Why do this?

  • Fun, informative way to simulate before you run your study
  • Strong hypothesis generation, prediction
  • Interactive report
  • Interactive publication!

Let’s explore

Learning more

Your turn

Complete a series of “quests”

Next time…

  • Python!