<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Irene Steves</title>
    <link>https://irene.rbind.io/</link>
    <description>Recent content on Irene Steves</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <copyright>© This post is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Please cite if you wish to quote or reproduce.</copyright>
    <lastBuildDate>Sun, 09 Aug 2020 00:00:00 +0000</lastBuildDate>
    
	<atom:link href="https://irene.rbind.io/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>About</title>
      <link>https://irene.rbind.io/about/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/about/</guid>
      <description>My background is in conservation and ecology, through which I first discovered R and data science. Through a series of &amp;ldquo;accidents&amp;rdquo;, I ended up in the Middle East, where I&amp;rsquo;ve now found a home among other R enthusiasts in Tel Aviv.
In the last year, I&amp;rsquo;ve co-organized and given data science workshops at the Zoology Department at Tel Aviv University. Our materials are all hosted on GitHub and free for reuse (and improvement).</description>
    </item>
    
    <item>
      <title>The nitty-gritty of the Label Propagation Algorithm</title>
      <link>https://irene.rbind.io/post/lpa/</link>
      <pubDate>Sun, 09 Aug 2020 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/lpa/</guid>
      <description>Sometimes to learn something, you just have to implement it yourself. In this post, I try my hand at approximating the Label Propagation Algorithm (LPA) proposed by Raghavan et al. in their paper, Near linear time algorithm to detect community structures in large-scale networks. It’s a fairly common (fast!) community detection algorithm that is implemented in igraph (C-based network analysis library with interfaces in R, Python, and Mathematica), GraphFrames (built on Spark, with APIs in Scala, Java, and Python–plus an R API created by RStudio), and in other places.</description>
    </item>
    
    <item>
      <title>Journey through DB Connect installation hell</title>
      <link>https://irene.rbind.io/post/db-connect-install/</link>
      <pubDate>Sun, 19 Jul 2020 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/db-connect-install/</guid>
      <description>Requirements Step 0: install Java 8 Step 1: Install the client Step 2: Configure connection properties Step 3: set up in R Conclusion   Using Databricks notebooks on their platform is not so bad, but once you want to access the super power of Spark from your local RStudio, you’ve got to prepare yourself for some installation hell. This post augments the Databricks official documentation and spells out the steps that tripped me up.</description>
    </item>
    
    <item>
      <title>Using SQL in RStudio</title>
      <link>https://irene.rbind.io/post/using-sql-in-rstudio/</link>
      <pubDate>Wed, 29 Apr 2020 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/using-sql-in-rstudio/</guid>
      <description>My starting point Previewing SQL in RStudio 1. Preview a .sql file 2. SQL chunks in RMarkdown  Passing variables to/from SQL chunks SQL output as a variable Providing query parameters  SQL files meet chunks R &amp;amp; SQL – working hand-in-hand   In the last year, SQL has wound its way deeper and deeper into my R workflow. I switch between the two every day, but up to now, I’ve been slow to dive into the SQL tools RStudio provides.</description>
    </item>
    
    <item>
      <title>Window functions - a SQL versus R example</title>
      <link>https://irene.rbind.io/post/window-functions/</link>
      <pubDate>Fri, 17 Apr 2020 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/window-functions/</guid>
      <description>The problem SQL solution R (tidyverse) solution Syntax comparison   I recently drilled down into window functions in SQL, so here’s a quick example comparing some of the syntax differences between SQL and R.
The problem We’ll start with a sequence of 10 orders with an order id and amount spent ($):
library(tidyverse) library(dbplyr) #for simulating a database library(slider) #for sliding window functions sample_orders &amp;lt;- tibble(o_id = 101:110, spent = round(runif(10, 5, 100), digits = 2))   o_id  spent      101  80.</description>
    </item>
    
    <item>
      <title>Windowed rank functions</title>
      <link>https://irene.rbind.io/post/window-rank-functions/</link>
      <pubDate>Fri, 17 Apr 2020 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/window-rank-functions/</guid>
      <description>In my recent exploration of window functions, I realized I didn’t really know the differences between rank functions. The dplyr documentation lists out six functions, of which I pretty much only use one (row_number()):
 row_number() ntile() min_rank() dense_rank() percent_rank() cume_dist()  Though the documentation description is relatively clear, it was still hard to grasp exactly how they differed. I found it easier to do the comparison visually.
Trivia score ranks Given a toy dataset of trivia scores from two teams, let’s see how the scores rank using the functions above.</description>
    </item>
    
    <item>
      <title>Visual notes - intro to igraph</title>
      <link>https://irene.rbind.io/post/intro-to-igraph/</link>
      <pubDate>Fri, 13 Mar 2020 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/intro-to-igraph/</guid>
      <description>I’ve worked with igraph a few times now, but I usually dive straight into what I want to do and bash my way through. Recently I decided to review the fundamentals…annotating and diagramming to help me remember the terminology and concepts. This post is a collection of those visual notes.</description>
    </item>
    
    <item>
      <title>Who you gonna call? R processes!</title>
      <link>https://irene.rbind.io/post/callr/</link>
      <pubDate>Sat, 15 Jun 2019 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/callr/</guid>
      <description>Aside from the workshops and talks at #rstudioconf back in January, I picked up many useful nuggets from the stream of conference tweets. I was particularly struck by this one:
Tired: Load R scripts with source()
Wired: Load R scripts with callr https://t.co/x2wxIOdmU0
&amp;mdash; Travis Gerke (@travisgerke) January 18, 2019  Not feeling the amazement? Okay, let me explain…
Loyal readers may remember my previous blogpost, in which I described starting a new R process to run a plumber API for testing.</description>
    </item>
    
    <item>
      <title>Encoding in R</title>
      <link>https://irene.rbind.io/post/encoding-in-r/</link>
      <pubDate>Sat, 16 Mar 2019 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/encoding-in-r/</guid>
      <description>Every once in a while I complain on Twitter when I try to mix non-English letters with R. I am certainly not the first person to be frustrated by encoding issues, though I am (maybe too) hopeful that the problems won’t last for much longer. We live in the age of vacuum bots and 3D-printing, so what makes multi-language support so complicated?
Trying to mix Hebrew with #rstats is a bit of a nightmare, but at least it led me to this amazing &amp;quot;String encoding and R&amp;quot; blogpost by @kevin_ushey.</description>
    </item>
    
    <item>
      <title>Iterative testing with plumber</title>
      <link>https://irene.rbind.io/post/iterative-testing-plumber/</link>
      <pubDate>Sun, 23 Dec 2018 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/iterative-testing-plumber/</guid>
      <description>This summer, I fiddled around with plumber, an R package for creating your very own web API. I got my start with Jeff Allen’s webinar, “Plumbing APIs with plumber” (slides here). I later dug into the topic some more using the plumber bookdown, along with a lot of trial and error.
In this blogpost, I’ll highlight how I gradually improved on my plumber building/testing workflow and eventually automated my testing steps.</description>
    </item>
    
    <item>
      <title>A summer of puzzles at RStudio</title>
      <link>https://irene.rbind.io/post/summer-rstudio/</link>
      <pubDate>Fri, 05 Oct 2018 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/summer-rstudio/</guid>
      <description>This summer, I teamed up with Jenny Bryan to create a series of coding puzzles, which (fingers crossed!) will be released next spring. It was exciting to start a project from the ground up, growing and shaping it over the 10-ish weeks of the internship.
Project background The Advent of Code puzzles were a major source of inspiration for the project. I spent a fair amount of my winter holidays last year solving the Advent of Code in R.</description>
    </item>
    
    <item>
      <title>A Tale of Two Testing Environments</title>
      <link>https://irene.rbind.io/post/two-test-env/</link>
      <pubDate>Sat, 11 Aug 2018 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/two-test-env/</guid>
      <description>Background Lesson 1: check suggested packages Lesson 2: MODULARIZE + use vagrant scripts with caution Internet solutions Conclusion   Today marks the second time I’ve debugged the problem of tests that pass with devtools::test() but fail with devtools::check(). Since I’m now riding my debug-success high (and hope never to repeat this again), here is a blogpost.
Background There are some resources online (see below) to help with debugging this particular problem, but they are sparse and situation-specific.</description>
    </item>
    
    <item>
      <title>FUNctional programming tricks in httr</title>
      <link>https://irene.rbind.io/post/fun-prog-httr/</link>
      <pubDate>Thu, 26 Jul 2018 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/fun-prog-httr/</guid>
      <description>httr basics On with the tricks! Embrace the backtick The null-default operator %||% Check argument inputs with match.arg() switch() out your if-elses Strange bedfellows  tl;dr - do read the source code!   Over the past few months, I worked on several projects that involved accessing web APIs in R, which meant I spent a lot of time puzzling over the functions and code in the httr package.</description>
    </item>
    
    <item>
      <title>Rats to reefs</title>
      <link>https://irene.rbind.io/post/rats-to-reefs/</link>
      <pubDate>Sat, 14 Jul 2018 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/post/rats-to-reefs/</guid>
      <description>This week, I came across two news articles about a study in Nature led by Nick Graham that linked invasive rats on islands to coral reefs. I was intrigued by how the different authors (in this case, Ed Yong from The Atlantic and Victoria Gill from the BBC) reported on the study, and took it as a sign that I should have some fun with text analysis.
Rats on islands eat all the seabirds --&amp;gt; less guano --&amp;gt; less nitrogen flowing into the sea --&amp;gt; fewer fish in offshore coral reefs.</description>
    </item>
    
    <item>
      <title>Presentations</title>
      <link>https://irene.rbind.io/presentations/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>https://irene.rbind.io/presentations/</guid>
      <description>Data science presentations</description>
    </item>
    
  </channel>
</rss>