<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~files/feed.xsl"?>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedpress="https://feed.press/xmlns" xmlns:media="http://search.yahoo.com/mrss/" xmlns:podcast="https://podcastindex.org/namespace/1.0" version="2.0">
  <channel>
    <feedpress:locale/>
    <atom:link rel="hub" href="https://feedpress.superfeedr.com/"/>
    <title>Ben Johnston</title>
    <atom:link href="https://feedpress.me/Ben-Johnston" rel="self" type="application/rss+xml"/>
    <link>https://www.ben-johnston.co.uk/</link>
    <description/>
    <lastBuildDate>Mon, 27 Jan 2025 19:09:19 +0000</lastBuildDate>
    <language>en-GB</language>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <image>
      <url>https://www.ben-johnston.co.uk/wp-content/uploads/2024/09/cropped-favicon-32x32.png</url>
      <title>Ben Johnston</title>
      <link>https://www.ben-johnston.co.uk/</link>
      <width>32</width>
      <height>32</height>
    </image>
    <item>
      <title>R for SEO Part 9: Web Scraping With R &amp; Rvest</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-9-web-scraping-with-r-rvest/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Mon, 27 Jan 2025 19:02:17 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">https://www.ben-johnston.co.uk/?p=3786</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-9-web-scraping-with-r-rvest/">R for SEO Part 9: Web Scraping With R &amp; Rvest</a></p>
<p>Hello, and welcome back. We’re (finally) in the home stretch of our R for SEO series with part nine, where we’re talking...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-9-web-scraping-with-r-rvest/">R for SEO Part 9: Web Scraping With R &amp; Rvest</a></p>
<p>Hello, and welcome back. We’re (finally) in the home stretch of our <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a> series with part nine, where we’re talking about scraping the web using R, particularly using the rvest package.</p>



<p>Today, we’re going to discuss the rvest package, look at the different scraping methods available, pull data from multiple pages and then look at how we can do so without bringing our target sites down. This is going to be fairly important for our final piece, so it’s worth paying attention today.</p>



<p>As always, this is a long piece, so feel free to use the table of contents below and do please sign up for my email list, where you’ll get updates of fresh content for free.</p>







<h2 class="wp-block-heading">The Rvest Package</h2>



<p>The <a href="https://rvest.tidyverse.org/" target="_blank">rvest</a> package for R – another <a href="https://hadley.nz/" target="_blank">Hadley Wickham</a> creation – is the most commonly used web scraping package for the R language, and it’s easy to see why. It brings a lot of the <a href="https://www.tidyverse.org/" target="_blank">Tidyverse’s</a> tidy data outputs and notation to what can be a complex element of data analysis and SEO. I’m a fan.</p>



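<p>It’s installed alongside the Tidyverse, but it isn’t one of the core packages that library(tidyverse) loads, so you’ll need to load it separately. If you don’t have it yet, you can install and load it like so:</p>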



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">install.packages("rvest")

library(rvest)</pre>



<p>This will install the rvest package and load it into your session. It’s also worth loading the Tidyverse as always.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(tidyverse)</pre>



<p>Now we have rvest installed, we can get scraping. But how do we find what to scrape from a page? That’s where we start needing to understand a bit about how scraping works and how to identify our data points.</p>



<h2 class="wp-block-heading">XPath &amp; CSS Selectors</h2>



<p>XPath and CSS selectors are the most widely used ways of identifying elements on a page, and are both crucial to understand when it comes to scraping the web.</p>



<p>Personally, since I’m a little bit more old-school, I tend to default to XPath, but that’s not to say CSS selectors aren’t brilliant – they certainly help you write much more efficient code.</p>



<p>Let’s look at the differences between the two.</p>



<h3 class="wp-block-heading">What Is XPath?</h3>



<p><a href="https://developer.mozilla.org/en-US/docs/Web/XPath" target="_blank">XPath</a> – or XML Path Language – is a syntax used for selecting nodes in a document. Think of it like a way to pinpoint specific parts of an XML tree structure, like finding particular branches or leaves. XPath has been around a long time and is very powerful, and rvest has great support for it.<br><br>There are a number of ways that you can find the relevant XPath query that you’ll need to scrape your chosen elements from your pages, and we’ll cover that very shortly.</p>
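


<p>To make that a little more concrete, here’s a minimal sketch of a few XPath queries run against a tiny, made-up HTML snippet rather than a live page. The snippet and the object names are purely illustrative assumptions, and minimal_html() is rvest’s helper for building a small document from a string. Don’t worry about the individual functions yet, as we’ll break them down later in the piece.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(rvest)

# A tiny, made-up page to practise on (not a real site)
samplePage &lt;- minimal_html('
  &lt;h1 class="entry-title"&gt;Example Post&lt;/h1&gt;
  &lt;p&gt;First paragraph with a &lt;a href="/about/"&gt;link&lt;/a&gt;.&lt;/p&gt;
  &lt;p&gt;Second paragraph.&lt;/p&gt;
')

# //h1 looks for h1 elements anywhere in the document
xpathHeading &lt;- samplePage %>% html_element(xpath = "//h1") %>% html_text()

# //p matches every paragraph, so we use html_elements() (plural) to get them all
xpathParagraphs &lt;- samplePage %>% html_elements(xpath = "//p") %>% html_text()

# //a selects the link, and html_attr() reads its href attribute
xpathLink &lt;- samplePage %>% html_element(xpath = "//a") %>% html_attr("href")</pre>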



<h3 class="wp-block-heading">What Are CSS Selectors?</h3>



<p><a href="https://www.w3schools.com/css/css_selectors.asp" target="_blank">CSS selectors</a> are the patterns that stylesheets use to target elements on a web page for styling, but they can do much more than just help create pretty sites. In R, you can leverage CSS selectors to extract almost any data you need from web pages.</p>
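


<p>As a rough sketch, here are three of the most common selector shapes (class, ID and attribute) run against a made-up snippet. The HTML, the example.com link and the object names are assumptions for illustration only.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(rvest)

# A tiny, made-up page to practise on (not a real site)
samplePage &lt;- minimal_html('
  &lt;h1 class="entry-title"&gt;Example Post&lt;/h1&gt;
  &lt;div id="intro"&gt;&lt;p&gt;Intro text.&lt;/p&gt;&lt;/div&gt;
  &lt;a href="https://example.com/" rel="nofollow"&gt;External link&lt;/a&gt;
')

# A class selector starts with a dot
cssHeading &lt;- samplePage %>% html_element(".entry-title") %>% html_text()

# An ID selector starts with a hash; the space means "a p somewhere inside #intro"
cssIntro &lt;- samplePage %>% html_element("#intro p") %>% html_text()

# Square brackets match on attributes
cssNofollow &lt;- samplePage %>% html_element("a[rel='nofollow']") %>% html_attr("href")</pre>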



<p>Again, rvest has a lot of native support for CSS selectors, and they tend to be quicker to write queries with, as well as needing less debugging. Both methods are completely valid and, truthfully, while I tend to default to XPath, CSS selectors are finding their way into my work a lot more because they mean writing less code. But what are the key differences?</p>



<h3 class="wp-block-heading">XPath Vs CSS Selectors</h3>



<p>Using Vs is a bit of a misnomer here – it’s not a fight, because they’re both worth using, but it’s worth understanding a little bit more about the differences between the two and when to use them.</p>



<p>In general terms – there are always exceptions – CSS selectors are:</p>



<ul class="wp-block-list">
<li><strong>Easier &amp; more efficient to write:</strong> A CSS selector is much shorter than an XPath query, and usually more intuitive, especially if you’ve been doing a lot of technical SEO work over your career</li>



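<li><strong>Often faster to run in a browser:</strong> Browsers evaluate CSS natively, which can make large-scale, browser-driven scraping projects much quicker. In rvest itself, CSS selectors are translated to XPath behind the scenes, so don’t expect a big speed difference there</li>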



<li><strong>Primarily designed for HTML pages:</strong> They’re not great at navigating up the DOM tree, and they’re not always so good for specific text elements</li>
</ul>



<p>Conversely, XPath tends to be:</p>



<ul class="wp-block-list">
<li><strong>More powerful:</strong> You can do very complex queries with XPath, which can allow you to navigate the DOM tree in both directions, as well as selecting elements by their text or position when there’s no convenient class or ID to hook into – more on that shortly</li>



<li><strong>More complex:</strong> Due to the power, an XPath query will generally be longer and more complex to write and maintain – but don’t let that put you off</li>
</ul>
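


<p>The differences above can be sketched roughly as follows, again on a made-up snippet (the list of posts and the object names are illustrative assumptions). The CSS version is shorter, while the XPath versions match on text and step back up the tree, neither of which standard CSS selectors have a direct equivalent for.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(rvest)

# A tiny, made-up list of posts (not a real site)
samplePage &lt;- minimal_html('
  &lt;ul&gt;
    &lt;li class="post"&gt;&lt;a href="/part-7/"&gt;Loops&lt;/a&gt; &lt;span&gt;2024&lt;/span&gt;&lt;/li&gt;
    &lt;li class="post"&gt;&lt;a href="/part-8/"&gt;Apply Methods&lt;/a&gt; &lt;span&gt;2025&lt;/span&gt;&lt;/li&gt;
  &lt;/ul&gt;
')

# CSS: short and readable when there is a class to hook into
cssLinks &lt;- samplePage %>% html_elements(".post a") %>% html_attr("href")

# XPath: match a link on the text it contains
applyLink &lt;- samplePage %>%
  html_element(xpath = "//a[contains(text(), 'Apply')]") %>%
  html_attr("href")

# XPath: start at that link, step up to its parent li, then down to the span
applyYear &lt;- samplePage %>%
  html_element(xpath = "//a[contains(text(), 'Apply')]/parent::li/span") %>%
  html_text()</pre>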



<p>So now we know the two key methods we’re going to be using to identify the elements we’ll be scraping today, let’s talk about how to find them.</p>



<h2 class="wp-block-heading">Finding CSS Selectors &amp; XPath Queries</h2>



<p>Obviously, it’s all well and good saying you want to use R to scrape a certain part of a page, such as the H1 or image alt text, but how do you actually find the elements you need? How do you identify the CSS selector without spending ages digging through the code and how do you go about putting an XPath query together?</p>



<p>Fortunately, there are a few very quick and easy ways to do that using Chrome. Let’s take a look at a couple of them.</p>



<h3 class="wp-block-heading">The SelectorGadget Chrome Extension</h3>



<p><a href="https://chromewebstore.google.com/detail/selectorgadget/mhjhnkcfbdhnjickkkdbjoemdmbfginb?hl=en" target="_blank">SelectorGadget</a> is a handy, free Chrome extension that I use a lot. It’s very quick and easy to use and can help you find your CSS selector or XPath at the click of a button. It’s not perfect – few things are – but I find it hits more than it misses.</p>



<p>Install it into your Chrome browser and then go to the page you want to scrape an element from. Let’s look at my <a href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/">last post</a> in this series.</p>



<p>Now let’s say we want to scrape the article title.</p>



<p>Click the SelectorGadget extension and hover over the article title like so:</p>



<figure class="wp-block-image size-full"><img fetchpriority="high" decoding="async" width="875" height="241" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173759.png" alt="SelectorGadget highlighting article title" class="wp-image-3789" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173759.png 875w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173759-300x83.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173759-150x41.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173759-768x212.png 768w" sizes="(max-width: 875px) 100vw, 875px" /></figure>



<p>You’ll see that it’s highlighted the selector.</p>



<p>If you look into the SelectorGadget bar and click the title, it’ll put the following in there:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="409" height="39" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173813.png" alt="SelectorGadget CSS selector highlight" class="wp-image-3790" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173813.png 409w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173813-300x29.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-173813-150x14.png 150w" sizes="(max-width: 409px) 100vw, 409px" /></figure>



<p>And there we go, we have our CSS selector.</p>



<p>Told you it was easy.</p>



<p>Now if you click on the XPath button to the right of the extension, like so:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="279" height="45" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174051.png" alt="XPath highlight in SelectorGadget" class="wp-image-3791" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174051.png 279w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174051-150x24.png 150w" sizes="(max-width: 279px) 100vw, 279px" /></figure>



<p>You’ll get a popup with the XPath query you need.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="425" height="186" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174104.png" alt="XPath popup from SelectorGadget" class="wp-image-3792" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174104.png 425w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174104-300x131.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174104-150x66.png 150w" sizes="(max-width: 425px) 100vw, 425px" /></figure>



<p>Again, these aren’t always perfect and sometimes won’t work the way they should, but they’ll give you a good starting point.</p>



<p>But what if, for some reason, you can’t install the extension, or you find it doesn’t work the way it should on certain sites? Fortunately, there’s another option that’s built right into Chrome and most other browsers.</p>



<h3 class="wp-block-heading">Finding CSS Selectors &amp; XPath Queries With Chrome Developer Tools</h3>



<p>I’m sure if you’ve been working in SEO for a while, you’ve become very familiar with Chrome or other browsers’ Developer Tools window. I don’t think I go a day without it, but did you know that you can also use it to find the CSS Selectors or create an XPath query based on a specific element? You probably did, but let’s talk through it anyway.</p>



<p>Personally, I tend to use this more when I’m trying to build a scraper for an area that isn’t visible on the page, such as something in the head, or when something stops SelectorGadget working effectively, but it’s definitely worth learning how to use it to help you as you get more familiar with scraping in R.</p>



<p>Go back to the page we discussed in the last section. Now highlight a part of the article title and right-click. Now click “Inspect”, like so:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="622" height="697" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174134.png" alt="Using Chrome Developer Tools on an article title" class="wp-image-3793" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174134.png 622w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174134-268x300.png 268w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174134-134x150.png 134w" sizes="(max-width: 622px) 100vw, 622px" /></figure>



<p>This will bring up the familiar “Elements” window:</p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="250" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174151-1024x250.png" alt="Chrome Developer Tools highlights article title" class="wp-image-3794" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174151-1024x250.png 1024w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174151-300x73.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174151-150x37.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174151-768x188.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174151.png 1293w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Now if you right-click on the element you care about (our article title, in this case), you’ll see the following dialogue. If you hover over “Copy”, it’ll pop out with a few very handy options:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="653" height="589" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174211.png" alt="Chrome Developer Tools find CSS Selector or XPath" class="wp-image-3795" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174211.png 653w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174211-300x271.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174211-150x135.png 150w" sizes="(max-width: 653px) 100vw, 653px" /></figure>



<p>You can copy your CSS selector or your XPath query directly from here, ready to paste in your code. I told you it was handy!</p>



<p>Again, it’s also not perfect, but between developer tools and SelectorGadget, I can pretty much always find the right element notation for whatever I’m trying to scrape.</p>
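


<p>One thing to watch for: the selectors that Developer Tools copies tend to be long, position-based paths. Here’s a minimal sketch of what using them might look like, on a made-up snippet. The post-123 ID, the markup and the object names are all hypothetical, and the exact strings Chrome gives you will depend on the page.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(rvest)

# A tiny, made-up page (the post-123 ID is hypothetical)
samplePage &lt;- minimal_html('
  &lt;div id="post-123"&gt;
    &lt;div class="header"&gt;&lt;h1 class="entry-title"&gt;Example Post&lt;/h1&gt;&lt;/div&gt;
  &lt;/div&gt;
')

# The style of selector that "Copy selector" tends to produce
copiedCss &lt;- "#post-123 > div > h1"

# The style of query that "Copy XPath" tends to produce
copiedXPath &lt;- '//*[@id="post-123"]/div/h1'

titleFromCss &lt;- samplePage %>% html_element(copiedCss) %>% html_text()
titleFromXPath &lt;- samplePage %>% html_element(xpath = copiedXPath) %>% html_text()

# A shorter class-based selector finds the same element and is less likely
# to break if the page layout changes
titleFromClass &lt;- samplePage %>% html_element(".entry-title") %>% html_text()</pre>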



<p>I’m conscious that we’re getting scarily close to 1,500 words and we haven’t scraped a single live page yet! Let’s fix that by using R to scrape the article title with the CSS selector.</p>



<h2 class="wp-block-heading">Scraping Article Titles With R &amp; CSS Selectors</h2>



<p>One of the things that is always worth remembering about using CSS selectors is that they will typically differ between websites, due to the fact that CSS styling tends to be individual to the site in question. That’s why tools like SelectorGadget or Chrome Dev Tools are so useful. Let’s write our first scraper using R’s rvest package to scrape the article title from my last post.</p>



<p>First, create an object of the URL to that post, like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">scrapeURL &lt;- "https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/"</pre>



<p>Now we want to find our CSS selector. Do that with SelectorGadget, and our selector will be as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="css" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">.entry-title</pre>



<p>Alright, we’re ready. Let’s create a very simple scraping call:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">articleTitle &lt;- read_html(scrapeURL) %>% html_element(".entry-title") %>% 
  html_text()</pre>



<p>Run that in your console, then type articleTitle, and you should see the following output:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="439" height="48" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174245.png" alt="R scraping article title output" class="wp-image-3796" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174245.png 439w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174245-300x33.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174245-150x16.png 150w" sizes="(max-width: 439px) 100vw, 439px" /></figure>



<p>Easy, right? Let’s take a look at how it works.</p>



<h3 class="wp-block-heading">Our First Scraper Command</h3>



<p>Rvest was created by the same brain behind the tidyverse (you may have guessed that I’m a fan of Mr Wickham’s work by now), and you can see certain similarities in other commands that we’ve written throughout this series.</p>



<p>As always, let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>articleTitle &lt;-: </strong>We’re giving our object the incredibly inventive name of “articleTitle”</li>



<li><strong>read_html(scrapeURL): </strong>Now we’re calling the read_html() function from rvest to download the html of our article that we put into scrapeURL</li>



<li><strong>%&gt;%: </strong>We’re using the <a href="https://style.tidyverse.org/pipes.html" target="_blank">tidyverse’s pipe operator</a> to chain multiple commands into one – you’ll have seen me use this throughout this series, but it’s always good to remind ourselves</li>



<li><strong>html_element(&#8220;.entry-title&#8221;):</strong> We’re telling rvest that we want the first element on the page that matches our CSS selector, “.entry-title”</li>



<li><strong>%&gt;% html_text(): </strong>Finally, we’re chaining in the html_text() command to only show us the text from the element we’ve selected</li>
</ul>
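


<p>One detail in the breakdown above is worth a quick sketch: html_element() only returns the first match, while its plural sibling html_elements() returns all of them. The snippet below is made up for illustration, as are the object names.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(rvest)

# A tiny, made-up page with two headings sharing one class
samplePage &lt;- minimal_html('
  &lt;h2 class="wp-block-heading"&gt;First Section&lt;/h2&gt;
  &lt;h2 class="wp-block-heading"&gt;Second Section&lt;/h2&gt;
')

# html_element() returns only the first match...
firstHeading &lt;- samplePage %>% html_element(".wp-block-heading") %>% html_text()

# ...while html_elements() returns every match as a character vector
allHeadings &lt;- samplePage %>% html_elements(".wp-block-heading") %>% html_text()</pre>



<p>If you only expect one of something on a page, such as an article title, the singular version keeps your output tidy. For things like subheadings or links, you’ll usually want the plural.</p>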



<p>And there you have it – we’ve built our very first scraper. Now let’s look at how we can use XPath from R’s rvest package to scrape different elements.</p>



<h2 class="wp-block-heading">Scraping Meta Titles With R &amp; XPath</h2>



<p>Now let’s look at how we can use XPath with R and the rvest package to pull the meta title from the same page. This is a nice, simple command, but hopefully it’ll give you an idea of how the power of XPath can be added to your web scraping work.</p>



<p>As before, we’re going to use my last post as the target page, so our <code data-enlighter-language="r" class="EnlighterJSRAW">scrapeURL</code> object is still valid.</p>



<p>Since we’re going for the meta title in this part rather than something visible on the page, SelectorGadget isn’t going to help us. Fortunately, we know how to use Chrome Dev Tools to find this.</p>



<p>Go to your target URL, right-click anywhere on the page and click “Inspect” to bring up the Elements window. Now find and expand the &lt;head&gt; element.</p>



<p>We want to focus on the &lt;title&gt; element here, so find that and right click on it, like so:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="979" height="390" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174428.png" alt="Copying meta title XPath in Chrome Developer Tools" class="wp-image-3799" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174428.png 979w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174428-300x120.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174428-150x60.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174428-768x306.png 768w" sizes="(max-width: 979px) 100vw, 979px" /></figure>



<p>Select “Copy XPath” and paste it somewhere.</p>



<p>Now we want to use the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">articleMetaTitle &lt;- read_html(scrapeURL) %>% html_element(, "//title") %>%
  html_text()</pre>



<p>Not too dissimilar to our previous scraping command, is it? But there are a couple of key differences, which we’ll break down shortly.</p>



<p>After this runs, type <code data-enlighter-language="r" class="EnlighterJSRAW">articleMetaTitle</code> into your console, and you should see the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="674" height="40" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174449.png" alt="Article meta title scraped in R console" class="wp-image-3801" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174449.png 674w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174449-300x18.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174449-150x9.png 150w" sizes="(max-width: 674px) 100vw, 674px" /></figure>



<p>And there we have it – our meta title pulled into our R environment. As always, let’s investigate the command.</p>



<h3 class="wp-block-heading">Our Meta Title Scraping Command Broken Down</h3>



<p>As you can see, using XPath on elements isn’t too dissimilar to CSS selectors when we’re using R to scrape the web – albeit, I’ve used the simplest XPath command I could have for this example, but hopefully you’re starting to see how this can be used for your SEO work.</p>



<p>Let’s break this command down:</p>



<ul class="wp-block-list">
<li><strong>read_html(scrapeURL) %&gt;%:</strong> As before, we’ve created our object (articleMetaTitle this time), used read_html() on our scrapeURL object and then used the pipe operator to link this command up with the next</li>



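<li><strong>html_element(, &#8220;//title&#8221;) %&gt;%: </strong>This is where the key difference between using XPath and CSS selectors comes in – the comma. html_element() takes a CSS selector as its first selector argument and an XPath query as its second, so the leading comma leaves the CSS slot empty and passes our query as XPath. You can also write this more explicitly as html_element(xpath = "//title"). In this case, we’re using a very basic XPath command to scrape the title element and then we’re chaining to our next command</li>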



<li><strong>html_text():</strong> As before, we’re using html_text() to only show us the text of our scraped data</li>
</ul>
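


<p>To show that the comma really is just shorthand, here’s a small sketch comparing the positional form from the command above with the named xpath argument. It runs on a made-up page (the title text and object names are assumptions) so that you can try it without hitting a live site.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(rvest)

# A tiny, made-up page with a title element
samplePage &lt;- minimal_html('&lt;p&gt;Body text&lt;/p&gt;', title = "Example Meta Title")

# Positional form: the empty slot before the comma skips the css argument
titlePositional &lt;- samplePage %>% html_element(, "//title") %>% html_text()

# Named form: does the same thing, but says so out loud
titleNamed &lt;- samplePage %>% html_element(xpath = "//title") %>% html_text()

# Check that both forms return the same text
sameResult &lt;- identical(titlePositional, titleNamed)</pre>



<p>Which one you use is a matter of taste. The named form is a little longer, but it’s harder to misread when you come back to your code in six months.</p>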



<p>This is quite easy, right? Building scrapers in R isn’t actually as complicated as it sounds, but obviously all we’ve done so far is pull one element at a time from a specific page.</p>



<p>Now let’s look at how we can pull multiple elements from a page.</p>



<h2 class="wp-block-heading">Scraping Titles, Descriptions &amp; H1s From A Page Using R</h2>



<p>Obviously, when we think about scraping the web, and using a programming language like R to do it, we’re thinking about scraping more than one thing at a time and getting them into a useful dataframe. So far, we’ve looked at using specific elements, but the point is scale – so now, let’s take a look at how we can scrape multiple elements from a page using R and rvest.</p>



<p>We’re going to build a function to scrape the meta title, meta description and the H1 from my last post. After that, we’ll look at scraping multiple elements from multiple pages.</p>



<p>Firstly, we need to get our elements. You can do that with a combination of Chrome Dev Tools and SelectorGadget, and with a CSS attribute selector for the meta description, we end up with the following list of elements:</p>



<ul class="wp-block-list">
<li><strong>//title:</strong> The meta title XPath</li>



<li><strong>meta[name='description'] and its content attribute:</strong> The CSS attribute selector for the meta description tag, plus the attribute that holds the description text</li>



<li><strong>.entry-title: </strong>The selector for gathering the article title, the H1</li>
</ul>



<p>Now we’ve got our list of elements, let’s get to work on our function. It’ll look a little something like this:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pageScrape &lt;- function(x){
  
  pageContent &lt;- read_html(x)
  
  metaTitle &lt;- html_element(pageContent,, "//title") %>% html_text()
  
  metaDescription &lt;- html_attr(html_element(pageContent, "meta[name='description']"), 
                               "content")
  
  heading &lt;- html_element(pageContent, ".entry-title") %>% html_text()
  
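  output &lt;- data.frame(metaTitle, metaDescription, heading)
  
  # Return the dataframe explicitly rather than relying on the invisible value of the assignment
  output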
  
}</pre>



<p>You can run this function like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pageElements &lt;- pageScrape(scrapeURL)</pre>



<p>And it’ll give you the following output:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="1004" height="189" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174730.png" alt="Multiple elements from a page scraped with R" class="wp-image-3803" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174730.png 1004w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174730-300x56.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174730-150x28.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174730-768x145.png 768w" sizes="(max-width: 1004px) 100vw, 1004px" /></figure>



<p>Still pretty simple, right? As always, let’s break it down.</p>



<h3 class="wp-block-heading">Our Multiple-Element Scraper Broken Down</h3>



<p>As you can see, we’ve used a function to pull some different elements from the page using the various nodes we’ve identified, and it’s given us some useful SEO data. Let’s dig in to how this function works.</p>



<ul class="wp-block-list">
<li><strong>pageScrape &lt;- function(x){: </strong>We’re creating our function called pageScrape, which takes a single argument, x: the URL of the page we want to scrape</li>



<li><strong>pageContent &lt;- read_html(x):</strong> Here, we’re getting the html of our target page into our environment using rvest’s read_html() function</li>



<li><strong>metaTitle &lt;- html_element(pageContent,, &#8220;//title&#8221;) %&gt;% html_text():</strong> As we saw with our earlier meta title scraping command, we’re using XPath to pull the title with “//title” and using html_text() to just get the text</li>



<li><strong>metaDescription &lt;- html_attr(html_element(pageContent, &#8220;meta[name=&#8217;description&#8217;]&#8221;), &#8220;content&#8221;):</strong> Meta descriptions can often be one of the most annoying parts to scrape, and they sometimes differ between sites. In this case, we’re using html_attr() instead of html_text() because the meta description is stored within the content attribute</li>



<li><strong>heading &lt;- html_element(pageContent, &#8220;.entry-title&#8221;) %&gt;% html_text():</strong> We’re re-using our H1 scraper from earlier, pulling the H1 using the .entry-title CSS selector</li>



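<li><strong>output &lt;- data.frame(metaTitle, metaDescription, heading):</strong> Finally, we’re creating our output dataframe that puts the meta title, meta description and H1 into separate columns. As this is the last line of the function, it’s also the value that gets returned when we assign the result to an object</li>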
</ul>
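


<p>If you’d like to try these selectors without hitting a live site, here’s a minimal sketch that runs the same three extractions against a made-up snippet of HTML. The sample markup and the <code data-enlighter-language="r" class="EnlighterJSRAW">extractElements</code> name are my own assumptions for illustration, and I’ve named the xpath argument rather than skipping the css one with a double comma, which should behave the same way.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(rvest)

# Made-up HTML standing in for a live page (hypothetical sample markup)
sampleHtml &lt;- '&lt;html>&lt;head>&lt;title>Example Meta Title&lt;/title>
&lt;meta name="description" content="Example meta description">&lt;/head>
&lt;body>&lt;h1 class="entry-title">Example Heading&lt;/h1>&lt;/body>&lt;/html>'

# Hypothetical helper: the same selectors as pageScrape, applied to parsed HTML
extractElements &lt;- function(pageContent){
  
  metaTitle &lt;- html_element(pageContent, xpath = "//title") %>% html_text()
  
  metaDescription &lt;- html_attr(html_element(pageContent, "meta[name='description']"), 
                               "content")
  
  heading &lt;- html_element(pageContent, ".entry-title") %>% html_text()
  
  # Returning the dataframe explicitly so it prints when called on its own
  data.frame(metaTitle, metaDescription, heading)
  
}

sampleElements &lt;- extractElements(read_html(sampleHtml))</pre>



<p>Inspecting sampleElements in the console should give you a one-row dataframe with the three example values in their own columns.</p>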



<p>Now let’s think about how we can run this across multiple pages.</p>



<h2 class="wp-block-heading">Applying R Scrapers To Multiple Pages</h2>



<p>If you’ve read my previous posts on using <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/"   title="loops" data-wpil-keyword-link="linked"  data-wpil-monitor-id="240">loops</a> and apply methods in R, you’ll have an idea of how we can run this across multiple pages.</p>



<p>For consistency’s sake, let’s run it across all the pages on my site, since the elements will all be the same.</p>



<p>Firstly, we want to get all of our target URLs into an object. The easiest way to do this is to scrape my pages sitemap.</p>



<p>We can do that like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pagesSitemap &lt;- read_html("https://www.ben-johnston.co.uk/page-sitemap.xml") %>% 
  html_elements(, "//loc") %>% html_text()</pre>



<p>Now if we use the <a href="https://www.rdocumentation.org/packages/purrr/versions/0.2.4/topics/reduce" target="_blank">reduce()</a> function from the tidyverse as we did in part 8, we can scrape all of the elements we discussed previously from all of my pages like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pagesElements &lt;- reduce(lapply(pagesSitemap, pageScrape), bind_rows)</pre>



<p>I won’t break this particular one down, as it’s covered in depth in part 8.</p>
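


<p>If you skipped part 8, here’s a rough sketch of what that pattern is doing, using a made-up function (fakeScrape, my own name) that returns a one-row dataframe instead of scraping anything. The example URLs are assumptions too.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(purrr)
library(dplyr)

# Hypothetical stand-in for pageScrape: returns one row per URL, no scraping
fakeScrape &lt;- function(x){
  
  data.frame(url = x, urlLength = nchar(x))
  
}

fakeURLs &lt;- c("https://example.com/", "https://example.com/about/")

# lapply returns a list containing one dataframe per URL
scrapedList &lt;- lapply(fakeURLs, fakeScrape)

# reduce applies bind_rows pairwise until a single dataframe is left
fakeElements &lt;- reduce(scrapedList, bind_rows)</pre>



<p>As bind_rows() also accepts a list of dataframes, <code data-enlighter-language="r" class="EnlighterJSRAW">bind_rows(scrapedList)</code> should give the same result in one step.</p>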



<p>You’ll see a few NAs in there because featured images are included in the sitemap, but you get the idea. For a more robust method, you could subset using the methods we learned all the way back in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">part 1</a>, like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pagesSitemap &lt;- subset(pagesSitemap, str_detect(pagesSitemap, "wp-content") == FALSE)</pre>



<p>Now we’ll only have the html pages. If we run our scraper again, we’ll get the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="990" height="314" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174858.png" alt="" class="wp-image-3806" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174858.png 990w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174858-300x95.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174858-150x48.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174858-768x244.png 768w" sizes="(max-width: 990px) 100vw, 990px" /></figure>
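


<p>To see that filter in isolation, here’s a small sketch on a made-up vector of URLs (the example.com addresses are assumptions for illustration). It keeps only the entries that don’t contain “wp-content”.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(stringr)

# Made-up URLs: two pages and one uploaded image
exampleURLs &lt;- c("https://example.com/",
                 "https://example.com/wp-content/uploads/image.png",
                 "https://example.com/about/")

# Keep only the URLs where str_detect doesn't find "wp-content"
exampleURLs &lt;- subset(exampleURLs, str_detect(exampleURLs, "wp-content") == FALSE)</pre>



<p>After running this, exampleURLs should contain just the two page URLs.</p>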



<p>So now we know how to scrape multiple elements from multiple pages, let’s talk about how we do that politely.</p>



<h2 class="wp-block-heading">Using The Polite R Package To Reduce Scrape Load</h2>



<p>Scraping websites isn’t always the most popular thing with site owners. Sometimes they don’t want their content being used that way (I personally block ChatGPT for precisely that reason), and also scraping multiple pages can put a large amount of load on a server, sometimes costing them money or even making them think that their site is under attack.</p>



<p>The <a href="https://dmi3kno.github.io/polite/" target="_blank">Polite package for R</a> ensures that your scraper respects robots.txt and also helps you reduce the amount of load you’re putting on a server. I’m giving you the techniques to scrape websites with R here, but it would be remiss of me not to tell you that you should do so politely and respect the owners of the sites.</p>



<p>Lecture over, let’s get the Polite package installed.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">install.packages("polite")

library(polite)</pre>



<h3 class="wp-block-heading">Using The Polite R Package’s Bow Function</h3>



<p>The Polite package has two key functions: <code data-enlighter-language="r" class="EnlighterJSRAW">bow</code> and <code data-enlighter-language="r" class="EnlighterJSRAW">scrape</code>. It is <em>very</em> polite. Bow introduces your R environment to the server and asks permission to scrape by checking the robots.txt file, and scrape then runs the scraper.</p>



<p>The three tenets of a polite R scraping session are defined by the authors as “seeking permission, taking slowly and never asking twice” and that’s as good a definition of scraping the web politely as we’ll find.</p>



<p>Let’s introduce ourselves to the target site using bow:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">session &lt;- bow("https://www.ben-johnston.co.uk")</pre>



<p>If we inspect our session object now, we’ll see the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="485" height="107" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174922.png" alt="A polite R scraping session from the R console" class="wp-image-3808" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174922.png 485w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174922-300x66.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174922-150x33.png 150w" sizes="(max-width: 485px) 100vw, 485px" /></figure>



<p>This has introduced us to the website and checked the robots.txt. Now let’s update our previous multiple page scraper to do so politely.</p>
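


<p>Bow also lets you say who you are and how long to wait between requests through its user_agent and delay arguments. Here’s a minimal sketch of that, wrapped in a function so nothing runs until you call it. The introduceScraper name, the user agent string and the ten second delay are all example values of mine rather than recommendations.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(polite)

# Hypothetical wrapper around bow() with an identifying user agent and a delay
introduceScraper &lt;- function(x){
  
  bow(x,
      user_agent = "Example SEO scraper (you@example.com)", # example value
      delay = 10)                                           # seconds between requests
  
}</pre>



<p>You could then create a session with <code data-enlighter-language="r" class="EnlighterJSRAW">session &lt;- introduceScraper("https://www.ben-johnston.co.uk")</code> and inspect it in the console as before.</p>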



<h3 class="wp-block-heading">Using The Polite R Package’s Scrape Function On Multiple URLs</h3>



<p>Now we understand about scraping politely, let’s update our previous <code data-enlighter-language="r" class="EnlighterJSRAW">pageScrape</code> function to scrape our multiple URLs politely.</p>



<p>This isn’t overly complicated and our updated function is as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">politeScrape &lt;- function(x){
  
  session &lt;- bow(x)
  
  pageContent &lt;- scrape(session)
  
  metaTitle &lt;- html_element(pageContent,, "//title") %>% html_text()
  
  metaDescription &lt;- html_attr(html_element(pageContent, "meta[name='description']"), 
                               "content")
  
  heading &lt;- html_element(pageContent, ".entry-title") %>% 
    html_text()
  
  output &lt;- data.frame(metaTitle, metaDescription, heading)
  
}</pre>



<p>It’s similar to our previous scraper function, isn’t it? But there’s one key difference: rather than using <code data-enlighter-language="r" class="EnlighterJSRAW">read_html()</code>, we’re creating a new bow() for every page and passing that session to the polite package’s <code data-enlighter-language="r" class="EnlighterJSRAW">scrape()</code> function. This means we introduce ourselves to each page, respect robots.txt and avoid hitting the site too hard.</p>
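


<p>An alternative worth knowing about is to bow once to the site and then use the polite package’s nod() function to move that same session to each URL. The sketch below is my own variation rather than the function used in this post, the politeScrapeNod name is hypothetical, and it assumes nod() will accept the full URLs from our sitemap as its path, which is how I read the package documentation.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(polite)
library(rvest)

# Hypothetical variant of politeScrape that reuses an existing bow() session
politeScrapeNod &lt;- function(x, session){
  
  # nod() points the existing session at the new URL before scraping
  pageContent &lt;- nod(session, path = x) %>% scrape()
  
  metaTitle &lt;- html_element(pageContent, xpath = "//title") %>% html_text()
  
  metaDescription &lt;- html_attr(html_element(pageContent, "meta[name='description']"), 
                               "content")
  
  heading &lt;- html_element(pageContent, ".entry-title") %>% html_text()
  
  data.frame(metaTitle, metaDescription, heading)
  
}</pre>



<p>To use it, you’d create the session once with <code data-enlighter-language="r" class="EnlighterJSRAW">session &lt;- bow("https://www.ben-johnston.co.uk")</code> and then pass it through lapply, for example <code data-enlighter-language="r" class="EnlighterJSRAW">lapply(pagesSitemap, politeScrapeNod, session = session)</code>.</p>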



<p>We can run it like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">politeElements &lt;- reduce(lapply(pagesSitemap, politeScrape), bind_rows)</pre>



<p>And if we inspect our politeElements object in the console, we should see the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="999" height="318" src="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174953.png" alt="Output of a polite R scraping session on multiple pages" class="wp-image-3809" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174953.png 999w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174953-300x95.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174953-150x48.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2025/01/Screenshot-2025-01-27-174953-768x244.png 768w" sizes="(max-width: 999px) 100vw, 999px" /></figure>



<p>So there we have it, that’s how you can use R’s rvest and polite packages to scrape multiple pages while being a good internet user.</p>



<h2 class="wp-block-heading">Wrapping Up</h2>



<p>This was quite a long piece and we’re reaching the end of my <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/category/r/r-seo/" title="R for SEO" data-wpil-keyword-link="linked" data-wpil-monitor-id="239">R for SEO</a> series (although there will definitely be a couple of bonus entries). I hope you’ve enjoyed today’s article on using R to scrape the web, that you’ve learned to do so politely and that you’re seeing some applications for this in your SEO work.</p>



<p>Until next time, where I’ll be talking about how we can build a reporting dashboard with R, Google Sheets and Google Looker Studio, using everything we’ve learned throughout this series.</p>



<h3 class="wp-block-heading">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Install Packages

install.packages("rvest")

library(rvest)

library(tidyverse)

# Scrape Article Title With CSS Selector

scrapeURL &lt;- "https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/"

articleTitle &lt;- read_html(scrapeURL) %>% html_element(".entry-title") %>% 
  html_text()

# Scrape Meta Titles With XPath

articleMetaTitle &lt;- read_html(scrapeURL) %>% html_element(, "//title") %>%
  html_text()

#Scrape Multiple Elements

pageScrape &lt;- function(x){
  
  pageContent &lt;- read_html(x)
  
  metaTitle &lt;- html_element(pageContent,, "//title") %>% html_text()
  
  metaDescription &lt;- html_attr(html_element(pageContent, "meta[name='description']"), 
                               "content")
  
  heading &lt;- html_element(pageContent, ".entry-title") %>% html_text()
  
  output &lt;- data.frame(metaTitle, metaDescription, heading)
  
}

pageElements &lt;- pageScrape(scrapeURL)

# Scrape Multiple Pages

pagesSitemap &lt;- read_html("https://www.ben-johnston.co.uk/page-sitemap.xml") %>% 
  html_elements(, "//loc") %>% html_text()

pagesSitemap &lt;- subset(pagesSitemap, str_detect(pagesSitemap, "wp-content") == FALSE)

pagesElements &lt;- reduce(lapply(pagesSitemap, pageScrape), bind_rows)

# Scraping Politely

install.packages("polite")

library(polite)

session &lt;- bow("https://www.ben-johnston.co.uk")

politeScrape &lt;- function(x){
  
  session &lt;- bow(x)
  
  pageContent &lt;- scrape(session)
  
  metaTitle &lt;- html_element(pageContent,, "//title") %>% html_text()
  
  metaDescription &lt;- html_attr(html_element(pageContent, "meta[name='description']"), 
                               "content")
  
  heading &lt;- html_element(pageContent, ".entry-title") %>% 
    html_text()
  
  output &lt;- data.frame(metaTitle, metaDescription, heading)
  
}

politeElements &lt;- reduce(lapply(pagesSitemap, politeScrape), bind_rows)</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 8: Apply Methods in R</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Mon, 20 Jan 2025 09:17:19 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">https://www.ben-johnston.co.uk/?p=3742</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/">R For SEO Part 8: Apply Methods in R</a></p>
<p>Welcome back again. Now we’re at part eight of my series on R for SEO and we’ve been on quite a journey...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/">R For SEO Part 8: Apply Methods in R</a></p>
<p>Welcome back again. Now we’re at part eight of my series on <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a> and we’ve been on quite a journey so far, haven’t we? What was originally meant to take me eight weeks has taken several years! Between a pandemic, several jobs, two house-moves, some big shakeups in my personal life and a lot of changes in the marketing ecosystem, it’s been a <em>lot</em>. But today, we’re in the home stretch and we’ll be talking about using the apply family in R.</p>



<p>Applys are a more “R-centric” way of running functions across a range of data. They’re not entirely dissimilar to <a href="https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/">loops</a>, but tend to be a more efficient way to write them and they do come with additional functionality which can be very useful in a number of situations.</p>



<p>You’ll have seen some of them being used already through this series, particularly when we talked about using <a href="https://www.ben-johnston.co.uk/r-for-seo-part-6-using-apis-in-r/">APIs in R</a>. So let’s take a look at the components of the apply family and what they can be used for.</p>



<p>As always, this is quite a long piece, so do feel free to skip around using the table of contents below and please do sign up for my email list to keep up to date when I publish new content.</p>







<h2 class="wp-block-heading">What Are Apply Commands In R?</h2>



<p>R’s apply family essentially allows you to apply a command or function across a range of data. The clue’s in the name, right? While not dissimilar to a loop in concept, there are a number of differences that do often make them a better or simpler choice.</p>



<p>As with loops, there are a few different variations that we can use. Let’s look at what they are before we get started on how we can use these different R apply methods in SEO work.</p>



<h3 class="wp-block-heading">The Different Kinds Of Apply Methods In R</h3>



<p>There are four key applies in the apply family:</p>



<ul class="wp-block-list">
<li><strong>Apply:</strong> The most basic one. Apply runs a function across the rows or columns of a matrix or dataframe. I generally use this when I want to find the maximum or minimum value of a column or if I’m trying to find the number of times a term is mentioned in a text corpus</li>



<li><strong>Lapply:</strong> You’ll have seen me use this a couple of times throughout this series and, truthfully, it’s the one I use the most. Lapply applies a function across a list or vector and is great for using functions and API calls</li>



<li><strong>Sapply:</strong> Similar to lapply, and another one I use quite a bit. Sapply means “simple apply” and tries to simplify the output compared to lapply, returning a vector or matrix rather than a list where it can</li>



<li><strong>Mapply: </strong>Mapply is the multivariate version of sapply and is great when you need to run a function across multiple vectors or lists at the same time</li>
</ul>
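


<p>To make the differences a little more concrete, here’s a quick sketch of all four on some made-up data. The numbers, keywords and object names are assumptions of mine purely for illustration.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up data for illustration
trafficMatrix &lt;- matrix(c(10, 25, 3, 40, 7, 18), nrow = 3,
                        dimnames = list(NULL, c("Clicks", "Impressions")))

keywords &lt;- c("r for seo", "web scraping", "rvest")

searchVolume &lt;- c(100, 50, 20)

# apply: run max over each column (2 means columns, 1 would mean rows)
columnMax &lt;- apply(trafficMatrix, 2, max)

# lapply: run nchar over each keyword and get a list back
lengthsList &lt;- lapply(keywords, nchar)

# sapply: the same thing, simplified to a named vector
lengthsVector &lt;- sapply(keywords, nchar)

# mapply: step through two vectors together, pairing keyword with volume
keywordLabels &lt;- mapply(function(kw, vol) paste(kw, vol), keywords, searchVolume)</pre>



<p>Inspect each object in the console and you’ll see that lapply gives you a list, while sapply and mapply give you vectors named after the keywords.</p>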



<p>So now we know what they are, let’s look at how to <em>apply</em> (Ha! I’m a comedic genius!) these to SEO work.</p>



<h2 class="wp-block-heading">Using R’s Apply Method To Count Keywords On Google Search Console Data</h2>



<p>The easiest way to show the difference between loops and applys is probably to replicate our loops from the last piece using the different apply methods.</p>



<p>To use the <a href="https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/apply" target="_blank">basic R apply method</a>, let’s rework what we did with our first for loop to count the number of keywords that have 20 or more impressions from our <a data-wpil-monitor-id="222" href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">Google Search Console</a> dataset – you can follow the steps from <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">part 2</a> to get that data, or just export from Google Search Console and import it using the read.csv function from <a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">part 1</a>.</p>



<p>Assuming you’ve named your dataframe gsc, use the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kwCount &lt;- sum(apply(gsc["Impressions"], 1, function(x) x >= 20))</pre>



<p>Now if we investigate our kwCount object in the console, using</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kwCount</pre>



<p>We should see the following – your data will be different, and probably much higher than mine.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="99" height="41" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-164822.png" alt="kwCount in R" class="wp-image-3399"/></figure>



<p>But we did that in one line, compared to the multiple lines of the loop we’d previously used, didn’t we?</p>



<p>As always, let’s break it down.</p>



<h3 class="wp-block-heading">Our Apply Command Broken Down</h3>



<p>Our apply command did what the loop achieved in one line rather than five. That’s interesting. Let’s see how it works.</p>



<ul class="wp-block-list">
<li><strong>kwCount &lt;-:</strong> As with all the commands we’ve used throughout the series, we’ve named our object and used &lt;- to tell R we want to keep it and use this particular name</li>



<li><strong>sum(apply:</strong> We’re invoking the sum command that we’re well familiar with, wrapped around apply. Apply runs our function over each row or column of the dataset and returns TRUE or FALSE for each one, and sum then counts the TRUE values</li>



<li><strong>gsc["Impressions"],:</strong> This tells R that we want to use this on our gsc dataset and that we want to focus on the Impressions column</li>



<li><strong>1:</strong> The number 1 specifies that we want to use this on rows. If we wanted it to be across columns, we’d use 2</li>



<li><strong>function(x) x &gt;= 20)):</strong> Finally, we’re using a very simple function that we’re applying across the rows to see if the value is greater than or equal to 20</li>
</ul>



<p>Pretty handy, right? It gives the same output as our first loop from part 7, but with a lot less code.</p>
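<p>If you’d like to check the logic without your own export to hand, here’s a small sketch on made-up data. The gscToy dataframe is a hypothetical stand-in for gsc. It also shows that, because comparisons in R are vectorised, the same count can be had without apply at all.</p>

```r
# Hypothetical stand-in for the gsc dataframe (made-up numbers)
gscToy <- data.frame(Query = c("r for seo", "apply in r", "rvest", "seo loops"),
                     Impressions = c(5, 20, 37, 12))

# The same apply call as above, run on the toy data
kwCountApply <- sum(apply(gscToy["Impressions"], 1, function(x) x >= 20))

# Comparison operators are vectorised, so this gives the same count
kwCountVectorised <- sum(gscToy$Impressions >= 20)

kwCountApply       # 2
kwCountVectorised  # 2
```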



<p>Now let’s take a look at lapply.</p>



<h2 class="wp-block-heading">Using R’s Lapply On A List or Dataframe</h2>



<p><a href="https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/lapply" target="_blank">Lapply</a> is one of R’s more popular apply methods and one that I use much more heavily than the regular apply command. Lapply runs across a list or vector and is particularly good for running across multiple API calls or URLs, which is, naturally, very handy for SEO analysis.</p>



<p>Let’s take a look at how we can use lapply to replicate our “list” loop from part 7, where we use it to subset our Google Search Console dataset to show data for keywords with 20 or more impressions.</p>



<p>Again, obviously, we’d use subset from part 1 for this most of the time, but it’s a good first example for using lapply.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kw20 &lt;- reduce(lapply(seq_along(gsc$Impressions), function(x) if (gsc$Impressions[x] >= 20) gsc[x, ] else NULL), bind_rows)</pre>



<p>Here, we’re using the reduce function from the Tidyverse (it lives in the purrr package) to do the job of do.call(rbind, ...) in less code.</p>



<p>This will give you a dataframe called kw20 which only includes keywords, impressions, clicks and CTR where the impressions are 20 or more.</p>



<p>Run that in your console and now, as always, let’s break it down.</p>



<h3 class="wp-block-heading">Our Lapply Command Broken Down</h3>



<p>Again, we’ve replicated our loop using an apply method – lapply in this case – in less code than the loop needed.</p>



<p>Hopefully you’re starting to see the power of the apply family for writing less code but getting the same results. We’ve trimmed it down even further by using the Tidyverse’s reduce function. You’re probably seeing why I never start R without the Tidyverse.</p>



<p>Let’s take a look at how it works:</p>



<ul class="wp-block-list">
<li><strong>kw20 &lt;- reduce(lapply(:</strong> We’re creating an object called kw20 and invoking the Tidyverse’s reduce command before calling lapply. Reduce takes the list that lapply returns and combines its elements, one at a time, into a single output</li>



<li><strong>seq_along(gsc$Impressions:</strong> As with our previous loop, we’re using seq_along on the Impressions column of our gsc dataset to apply our function across every row</li>



<li><strong>function(x):</strong> We’re creating our very simple function with the variable x, similar to how we’ve done all the way through this series</li>



<li><strong>if (gsc$Impressions[x] &gt;= 20) gsc[x, ] else NULL),:</strong> If you cast your mind back to part 5 where we talked about if statements in R, this won’t be too unfamiliar. Our function is a very simple if statement, where we’re seeing if our Impressions value in the specific row of the apply is greater than or equal to 20. If it is, the function returns that whole row of gsc. If it’s not, the else returns NULL, which means nothing</li>



<li><strong>bind_rows):</strong> Finally, we’re using the bind_rows command from dplyr (part of the Tidyverse) to stack each returned row underneath the previous ones in our output dataset</li>
</ul>



<p>So there we have it – a really simple lapply command to run a simple function across multiple rows of a dataset.</p>
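<p>To see what lapply is actually handing over before anything gets bound together, here’s a rough sketch in base R only, using made-up data. The gscToy dataframe and the other object names are hypothetical, and I’ve swapped reduce and bind_rows for do.call(rbind, ...) so it runs without the Tidyverse loaded.</p>

```r
# Hypothetical stand-in for the gsc dataframe (made-up numbers)
gscToy <- data.frame(Query = c("r for seo", "apply in r", "rvest", "seo loops"),
                     Impressions = c(5, 20, 37, 12),
                     stringsAsFactors = FALSE)

# lapply always returns a list: a one-row dataframe where the test passes,
# and NULL where it fails
rowList <- lapply(seq_along(gscToy$Impressions),
                  function(x) if (gscToy$Impressions[x] >= 20) gscToy[x, ] else NULL)

# Drop the NULL entries, then stack what is left with base R
kept <- Filter(Negate(is.null), rowList)
kw20Base <- do.call(rbind, kept)

# For a simple filter like this, logical indexing returns the same rows
kw20Index <- gscToy[gscToy$Impressions >= 20, ]
```

<p>Printing rowList in your console shows the NULL placeholders for the rows that didn’t make the cut, which is a handy way to debug an lapply that isn’t returning what you expect.</p>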



<p>Now let’s look at sapply and how it’s similar, but also a little different.</p>



<h2 class="wp-block-heading">Using R’s Sapply On A List or Dataframe</h2>



<p><a href="https://r-coder.com/sapply-function-r/" target="_blank">Sapply</a> – meaning “Simple apply” – can work very similarly to lapply in a lot of instances, but it is focused on simplifying the output into a vector or matrix, rather than a list. I find that when I use reduce and bind_rows from the Tidyverse, I get better results from lapply than sapply.</p>



<p>Either way, let’s run through how we can do the same command that we just did on lapply using sapply instead.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kw20 &lt;- reduce(sapply(seq_along(gsc$Impressions), 
                                function(x) if (gsc$Impressions[x] >= 20) gsc[x, ], 
                                simplify = FALSE), bind_rows)</pre>



<p>If this runs the way it should, it’ll give you the same output in kw20 as the lapply command did, but there are some differences in the command that we want to look at.</p>



<h3 class="wp-block-heading">Our Sapply Command</h3>



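<p>As you’ll see from looking at the code, the command we’ve used here is pretty much exactly the same as our lapply command (we’ve only dropped the else NULL, because an if with no else already returns NULL when the condition isn’t met), with one key difference – we have the following parameter in there:</p>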



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">simplify = FALSE</pre>



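<p>Since the core tenet of sapply is to “simply” apply, it tries to simplify its output to a vector or matrix. Our function returns whole dataframe rows, which we want to keep as they are, so we use simplify = FALSE here. That makes sapply return a list, just as lapply does, ready for reduce and bind_rows.</p>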



<p>However, if you’re trying to coerce a list or matrix into something a bit easier to work with, sapply is a great choice. I find myself mostly using it when I’m working across JSON outputs, but for this example, you can see that it works quite nicely in the same way our lapply method did.</p>
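<p>Here’s a quick sketch of what that simplifying looks like in practice, on made-up numbers. All of the object names are just for illustration.</p>

```r
impressions <- c(5, 20, 37, 12)   # made-up values

# lapply: a list with one TRUE/FALSE per element
asList <- lapply(impressions, function(x) x >= 20)

# sapply: the same results, simplified to a logical vector
simplified <- sapply(impressions, function(x) x >= 20)

# sapply with simplify = FALSE: back to a list
notSimplified <- sapply(impressions, function(x) x >= 20, simplify = FALSE)

# When every result has the same length greater than one, sapply builds a matrix
asMatrix <- sapply(impressions, function(x) c(value = x, doubled = x * 2))
```

<p>Running class() or str() on each of these is a good way to get a feel for which shape you’ll get back.</p>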



<p>I’m sure you’re seeing plenty of ways that R’s apply family can be used for SEO by now. Let’s take a look at the most complex of the apply methods now – the powerful mapply().</p>



<h2 class="wp-block-heading">Using R’s Mapply Method</h2>



<p><a href="https://www.statology.org/r-mapply/" target="_blank">Mapply</a> allows you to run apply commands across multiple dataframe elements or vectors and is really handy when you need to do this. It’s a very powerful function and allows you to do some very complex analysis work with very little code. Let’s use a very simple example to identify a difference in our click through rates against a target, using our Google Search Console data.</p>



<p>Firstly, if you’ve imported your Google Search Console dataset from a CSV export rather than using the API, you’ll need to do a little preparation to remove the percentage character, set the column as a number and divide it by 100. This simple function below will do that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ctrCleanup &lt;- function(x){
  
  x &lt;- gsub("%", "", x) # Remove the % character
  
  x &lt;- as.numeric(x) # Convert from character to number
  
  x &lt;- x/100 # Turn the percentage into a decimal
  
  x # Return the cleaned values
  
}</pre>



<p>You’ll have seen all of this throughout the series, but let’s break it down anyway:</p>



<ul class="wp-block-list">
<li><strong>ctrCleanup &lt;- function(x){:</strong> Our function is called ctrCleanup (catchy, right?) and has an x variable</li>



<li><strong>x &lt;- gsub("%", "", x):</strong> On x, (our CTR column), we’re running gsub to find and replace the % character with nothing. Removing it, effectively</li>



<li><strong>x &lt;- as.numeric(x):</strong> Now we’re setting that column as a number rather than the character it was previously since we’ve removed the % character</li>



<li><strong>x &lt;- x/100:</strong> Finally, we’re dividing it by 100 so we can get our percentage as a decimal</li>
</ul>



<p>Run this in your console with the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gsc$CTR &lt;- ctrCleanup(gsc$CTR)</pre>



<p>and you’ll have your CTR column as decimals, ready to work with our mapply command.</p>
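<p>If you want to sanity-check the cleanup before pointing it at your real data, here’s a small sketch on a few made-up values. I’ve called this version ctrCleanupSketch so it doesn’t overwrite the function above; it’s a compact variant with the same three steps, not a replacement.</p>

```r
# Compact, hypothetical variant of the cleanup function with an explicit return value
ctrCleanupSketch <- function(x) {
  x <- gsub("%", "", x, fixed = TRUE)  # strip the % sign
  as.numeric(x) / 100                  # convert to a number, then to a decimal
}

rawCTR <- c("5.2%", "10%", "0.75%")    # made-up values in the CSV export's format
cleanCTR <- ctrCleanupSketch(rawCTR)

cleanCTR              # the CTRs as decimals
is.numeric(cleanCTR)  # TRUE
```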



<p>Now we want to define our target CTR. Let’s take 5%.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">targetCTR &lt;- 0.05</pre>



<p>Now we’re prepared, let’s run our mapply command.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gsc$ctrDiff &lt;- mapply(function(x, y) abs(x - y), gsc$CTR, targetCTR)</pre>



<p>This will create a new column in our dataset called ctrDiff with the difference between our actual click through rate and our target.</p>



<h3 class="wp-block-heading">Our Mapply Command Broken Down</h3>



<p>As always, let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>gsc$ctrDiff &lt;- mapply(:</strong> We’re creating a new column called ctrDiff in our gsc dataframe and using mapply to run our function</li>



<li><strong>function(x, y) abs(x - y): </strong>Our function uses x and y parameters (our two data elements) and uses abs to find the absolute difference between them</li>



<li><strong>gsc$CTR, targetCTR): </strong>Finally, we’re defining our x and y variables – our actual click through rate from Google Search Console and our target CTR</li>
</ul>



<p>Running this will create a new column in gsc and will give you the difference by query from your target CTR. Hopefully it’ll give you some idea of where you need to focus your SEO efforts. Or maybe not, but at least it gives you an idea of how mapply works!</p>
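<p>Where mapply starts to earn its keep is when both inputs have a value per row. Here’s a rough sketch, assuming each query had its own target CTR. The per-query targets are entirely made up and aren’t part of the Search Console data.</p>

```r
# Made-up data: each query has its own CTR and its own target
ctr       <- c(0.031, 0.080, 0.120)
targetCTR <- c(0.050, 0.050, 0.100)   # hypothetical per-query targets

# mapply walks through both vectors together:
# (ctr[1], targetCTR[1]), then (ctr[2], targetCTR[2]), and so on
ctrDiff <- mapply(function(x, y) abs(x - y), ctr, targetCTR)

# abs() and - are already vectorised, so this produces the same numbers
ctrDiffVectorised <- abs(ctr - targetCTR)
```

<p>For simple arithmetic like this, the vectorised version is usually all you need. Mapply becomes more useful when the function you’re applying isn’t vectorised itself, such as an API call that takes two arguments.</p>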



<p>So that’s the basics of how the different apply methods in R can work. There’s a common thread to the anatomy of them, isn’t there?</p>



<p>Let’s look at that now.</p>



<h2 class="wp-block-heading">The Anatomy of R’s Apply Methods</h2>



<p>In general, the members of the apply family share a common anatomy. You’ll find yourself working around this as you go further in your R journey, but you’ll generally look at it like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">dataframe &lt;- applyMethod(data, function, extra parameters)</pre>



<p>It seems a little backwards, doesn’t it? Usually we call our function before the data. Truthfully, I don’t fully know why it’s done this way, but I suspect it’s to do with the fact that we’re defining the data to apply the function to before we start actually applying it. All programming languages have these fun little elements to them, and I think this is one of R’s.</p>
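<p>Here’s that anatomy as a small runnable sketch, including the “extra parameters” part. The atLeast helper and the numbers are made up for illustration. It also shows mapply, which is the exception to the pattern, since its function comes first.</p>

```r
impressions <- c(5, 20, 37, 12)   # made-up values

# A hypothetical helper that takes a second argument
atLeast <- function(x, threshold) x >= threshold

# Data first, then the function, then any extra arguments for that function
flags <- sapply(impressions, atLeast, threshold = 20)

# mapply is the odd one out: the function comes first, then the data
flagsM <- mapply(atLeast, impressions, 20)

flags   # FALSE TRUE TRUE FALSE
```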



<p>I think this might actually be the shortest of the posts I’ve written in this series, but don’t let that be an indication of the power of R’s apply family – they are absolutely vital in R programming and an essential element in using <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/category/r/r-seo/" title="R for SEO" data-wpil-keyword-link="linked" data-wpil-monitor-id="219">R for SEO</a>.</p>



<p>Until next time, where we’ll be talking about web scraping in R.</p>



<h3 class="wp-block-heading">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Install Packages

install.packages("tidyverse")

library(tidyverse)

# Read In Data

gsc &lt;- read.csv("Queries.csv", stringsAsFactors = FALSE)

# Apply Methods

## Apply()

kwCount &lt;- sum(apply(gsc["Impressions"], 1, function(x) x >= 20))

## Lapply()

kw20 &lt;- reduce(lapply(seq_along(gsc$Impressions), function(x) if 
                      (gsc$Impressions[x] >= 20) gsc[x, ] else NULL), bind_rows)

## Sapply()

kw20 &lt;- reduce(sapply(seq_along(gsc$Impressions), 
                                function(x) if (gsc$Impressions[x] >= 20) gsc[x, ], 
                                simplify = FALSE), bind_rows)

## Mapply

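ctrCleanup &lt;- function(x){
  
  x &lt;- gsub("%", "", x) # Remove the % character
  
  x &lt;- as.numeric(x) # Convert from character to number
  
  x &lt;- x/100 # Turn the percentage into a decimal
  
  x # Return the cleaned values
  
}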

gsc$CTR &lt;- ctrCleanup(gsc$CTR)

targetCTR &lt;- 0.05 

gsc$ctrDiff &lt;- mapply(function(x, y) abs(x - y), gsc$CTR, targetCTR)</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 7: Loops</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Mon, 19 Aug 2024 17:02:47 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">https://www.ben-johnston.co.uk/?p=3397</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/">R For SEO Part 7: Loops</a></p>
<p>Welcome back to my R for SEO series. We’re in the home stretch now, with part seven. Today, we’re going to be...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/">R For SEO Part 7: Loops</a></p>
<p>Welcome back to my <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a> series. We’re in the home stretch now, with part seven. Today, we’re going to be looking at different ways that we can run functions or commands over a series of elements using the various kinds of loops that exist in R.</p>



<p>If you’ve followed along so far, or you’ve tried some experimentation of your own, you’ve probably encountered loops and applys along the way. I know early on in my R journey, it very much seemed like pot luck as to which <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/"   title="apply" data-wpil-keyword-link="linked"  data-wpil-monitor-id="230">apply</a> I should use, or whether a loop was easier, so hopefully today’s piece will start to clear that up for you a little.</p>



<p>I know that most programming courses cover these elements earlier, but for me, it really didn’t click until I&#8217;d learned more about the other areas we’ve covered in this series, so that’s why I&#8217;ve placed it here.</p>



<p>As always, if you’ve found this useful, please give it a share on your social networks and please sign up to my free email updates to be alerted when I drop my next article.</p>






<h2 class="wp-block-heading">What Is A Loop?</h2>



<p>A loop in R is more or less what it sounds like – a command that keeps running some code until a certain condition stops it.</p>



<p>There are two main types that we’ll look at today: the for loop and the while loop.</p>



<p>Before we jump into how they work, let’s look at what the two different loops do and are used for.</p>



<ul class="wp-block-list">
<li><strong>The for loop:</strong> The for loop is the most commonly used one in R and works great if you have a defined vector or dataset that you want your commands to run over, or you know how many times you want to run it</li>



<li><strong>The while loop:</strong> The while loop keeps running as long as a certain condition is met, whether it is a certain value, loop length or even timeframe. They’re very useful</li>
</ul>
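<p>As a very rough sketch of the difference between the two, using made-up numbers and hypothetical object names:</p>

```r
# for: we know up front what we are iterating over
total <- 0
for (val in c(5, 20, 37, 12)) {   # made-up impression counts
  total <- total + val
}

# while: keep going for as long as the condition is true
attempts <- 0
while (attempts < 3) {
  attempts <- attempts + 1
}

total     # 74
attempts  # 3
```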



<h2 class="wp-block-heading">The For Loop In R</h2>



<p>If you’re familiar with other programming languages like Python, the humble <a href="https://www.w3schools.com/r/r_for_loop.asp" target="_blank">for loop</a> will already be in your arsenal, and it’s no less powerful in R.</p>



<p>Let’s put a really simple for loop together below, running through our Google Search Console data from the last few pieces (the tutorial is in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">part 2</a>). We’re going to use this loop to count the number of keywords which have 20 or more impressions.</p>



<p>First we want to create an object called kwCount and we’re going to set its value to zero, like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kwCount &lt;- 0</pre>



<p>Now to create our loop:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for (val in gsc$Impressions){
  if(val >= 20)  kwCount = kwCount+1
}</pre>



<p>OK, let’s break it down.</p>



<h3 class="wp-block-heading">The For Loop In R Explained</h3>



<p>There is a lot that you can do with loops, and this is only a really basic example, but they all follow the same general process.</p>



<ul class="wp-block-list">
<li><strong>for (val in gsc$Impressions){:</strong> We’re starting our loop with “for” and saying that, for every value in gsc$Impressions, R should run the commands within our braces. There are a number of different ways that loops can be used, and I’ll show a couple more along the way, but the anatomy is always similar</li>



<li><strong>if(val &gt;= 20)&nbsp; kwCount = kwCount+1}:</strong> As we saw in our <a href="https://www.ben-johnston.co.uk/r-for-seo-part-5-common-excel-formulas-in-r/">R if statements</a> tutorial, our braces contain our commands. In this simple example, we’re using a small if statement, saying that if the value is greater than or equal to 20, we add 1 to our kwCount object</li>
</ul>



<p>So as you can see, this is a very simple for loop using R. As in our <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/" data-wpil-monitor-id="204">R functions</a> piece, we don’t strictly <em>need</em> a loop for this, but it’s a simple way to show you the anatomy.</p>



<p>If you now type kwCount in your console, you’ll see the total number of your Google Search Console queries that have 20 or more impressions. In my case, I get the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="99" height="41" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-164822.png" alt="Output of an R for loop" class="wp-image-3399"/></figure>
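<p>As a point of comparison, the same count can be sketched without a loop. This is a minimal illustration rather than part of the walkthrough above: the small gsc dataframe is made-up sample data standing in for a real Search Console export, and the is.na() guard in the loop is an addition for the sake of the example.</p>

```r
# Made-up sample data standing in for the real Search Console export
gsc <- data.frame(
  Top.queries = c("r for seo", "rvest tutorial", "r loops", "apply in r"),
  Impressions = c(45, 12, 20, NA)
)

# The counting loop, with a guard so a missing value does not stop it
kwCount <- 0
for (val in gsc$Impressions) {
  if (!is.na(val) && val >= 20) kwCount <- kwCount + 1
}

# Vectorised equivalent: the comparison gives TRUE/FALSE for every row,
# and sum() counts the TRUEs (na.rm = TRUE ignores missing values)
kwCountVec <- sum(gsc$Impressions >= 20, na.rm = TRUE)

print(kwCount)
print(kwCountVec)
```

<p>Without the is.na() guard, an if statement that meets a missing value stops with an error, which is worth knowing if your export has gaps.</p>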



<h3 class="wp-block-heading">Other For Loop Methods</h3>



<p>In our previous example, we used a simple <em>val in</em> method for our for loop, but there are many others.</p>



<p>Personally, I use loops on lists or vectors a lot more, so I often reach for some of the alternatives. Let’s take a look at them:</p>



<h3 class="wp-block-heading">Using For Loops On A List</h3>



<p>Let’s use a for loop to create a subset of our Google Search Console data that only includes rows which have 20 or more impressions. Again, a loop is overkill here, but it’s a good example.</p>



<p>First, we want to create our dataframe to host our data. Let’s call that kw20.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kw20 &lt;- data.frame()</pre>



<p>Now our for loop is as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for (i in seq_along(gsc$Impressions)) {
  if (gsc$Impressions[i] >= 20) {
    kw20 &lt;- rbind(kw20, gsc[i, ])
  }
}</pre>



<p>Fairly self-explanatory, isn’t it? But, as always, let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>for (i in seq_along(gsc$Impressions)){:</strong> As before, we’re invoking our for loop, but there’s a key difference here. We’re using “i in seq_along(gsc$Impressions)”, which means “for each position (i) in the sequence along our Impressions column, we want to do what is in our braces”. Note that seq_along(gsc) on its own would count the columns of the dataframe rather than its rows, which is why we point it at the column</li>



<li><strong>if (gsc$Impressions[i] &gt;= 20){:</strong> We’re using an if statement to check whether the Impressions value at the position we’ve looped to (i) is greater than or equal to 20 and, if so, to do our next action in the braces</li>



<li><strong>kw20 &lt;- rbind(kw20, gsc[i, ])}}:</strong> Let’s bring it home with our action. Our loop will take the row (i) that matches the condition we set in the previous command and, using rbind, will add it to our kw20 frame</li>
</ul>



<p>Obviously, if we were just looking to subset our dataset according to these conditions, we’d just use subset as we saw in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">part 1</a>, and using this loop with rbind within would be very inefficient, but I hope it gives you a good example of how you can use the for loop across a list.</p>
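<p>To make that comparison concrete, here’s a rough sketch of both routes. The gsc dataframe is made-up sample data, and the keep vector is just one possible way of avoiding rbind inside a loop, not the only one.</p>

```r
# Made-up sample data standing in for the real Search Console export
gsc <- data.frame(
  Top.queries = c("r for seo", "rvest tutorial", "r loops", "apply in r"),
  Impressions = c(45, 12, 20, 8)
)

# The one-line alternative: keep rows with 20 or more impressions
kw20_subset <- subset(gsc, Impressions >= 20)

# If a loop is still wanted, record which rows to keep in a logical vector
# created up front, then subset once at the end instead of calling rbind
# on every pass
keep <- logical(nrow(gsc))
for (i in seq_len(nrow(gsc))) {
  keep[i] <- gsc$Impressions[i] >= 20
}
kw20_loop <- gsc[keep, ]

print(kw20_subset)
print(kw20_loop)
```

<p>Both routes should return the same rows. The difference is that rbind inside a loop copies the growing dataframe on every pass, which is where the inefficiency comes from.</p>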



<p>To see it in action, you can see my <a href="https://www.ben-johnston.co.uk/bulk-resizing-images-with-r/">Bulk Resizing Images in R</a> post, which features loops quite a bit.</p>



<h3 class="wp-block-heading">Using For Loops On A Dataframe</h3>



<p>Similar to our list above, let’s create a loop that subsets a dataframe if impressions are equal to or less than 20.<br><br>It’s not too dissimilar, but for the sake of this exercise, I’m going to use seq_len instead of seq_along. seq_len(nrow(gsc)) counts the rows of the dataframe directly, which makes it the more commonly-used iteration for dataframes.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kwL20 &lt;- data.frame()

for (i in seq_len(nrow(gsc))) {
  if (gsc$Impressions[i] &lt;= 20) {
    kwL20 &lt;- rbind(kwL20, gsc[i, ])
  }
}
</pre>



<p>As you can see, it’s much the same as on our list, aside from that I’ve changed our output dataframe to kwL20 (keywords with 20 or fewer impressions) and used seq_len(nrow(gsc)) to step through the rows of our frame. This gives the same sequence as seq_along on a single column, but is more explicit about looping over rows. I’ve also set the impressions volume to be less than or equal to 20 for this exercise.</p>
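<p>If the difference between the two helpers feels abstract, this short sketch shows what each one returns. The gsc dataframe here is made-up sample data with four rows and two columns.</p>

```r
# Made-up sample data: 4 rows, 2 columns
gsc <- data.frame(
  Top.queries = c("r for seo", "rvest tutorial", "r loops", "apply in r"),
  Impressions = c(45, 12, 20, 8)
)

cols_idx <- seq_along(gsc)              # one index per column of the dataframe
vals_idx <- seq_along(gsc$Impressions)  # one index per value in the column
rows_idx <- seq_len(nrow(gsc))          # one index per row

print(cols_idx)
print(vals_idx)
print(rows_idx)

# Both helpers return an empty sequence for empty input, so a loop over
# them simply does not run. The 1:nrow() shortcut counts backwards instead.
empty <- gsc[0, ]
safe_idx <- seq_len(nrow(empty))
risky_idx <- 1:nrow(empty)

print(length(safe_idx))
print(risky_idx)
```

<p>That last point is the main reason to prefer seq_len or seq_along over writing 1:nrow(gsc) by hand.</p>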



<p>So there we have it. An introduction to the for loop in R. While these are simple examples, I hope it’s given you an idea of how they can work and be used in your SEO work. Next up, let’s have a look at the while loop.</p>



<h2 class="wp-block-heading">While Loops In R</h2>



<p>Where the for loop executes our code across every item in our dataset, the <a href="https://www.w3schools.com/r/r_while_loop.asp" target="_blank">while loop</a> is driven by a condition instead. A while loop in R will keep executing its commands for as long as a condition is met, and will stop when that condition is no longer true.<br><br>While loops are great for automation. I’ve used them in the past to run real-time <a data-wpil-monitor-id="205" href="https://www.ben-johnston.co.uk/r-for-seo-part-6-using-apis-in-r/">API data</a> from Salesforce into my environment during a specific timeframe, for example. Again though, they’re very simple to create and execute.</p>



<p>Let’s do a very simple while loop in R, looking at our Google Search Console dataset once again.</p>



<h3 class="wp-block-heading">A While Loop To Find Keywords With 20 Or More Impressions</h3>



<p>Again, this is a bit of a case of a sledgehammer to crack a nut, but hopefully this simple example will give you some ideas of where and how you can use a while loop in your day to day SEO work with R.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kw_df &lt;- data.frame(Query = character())

index &lt;- 1

while (index &lt;= nrow(gsc)) {
  if (gsc$Impressions[index] >= 20) {
    kw_df &lt;- rbind(kw_df, data.frame(Query = gsc$Top.queries[index]))
  }
  index &lt;- index + 1
}
</pre>



<p>Again, by this point in your R journey, this might look pretty simple. But let’s break it down anyway.</p>



<h3 class="wp-block-heading">The While Loop Explained</h3>



<p>Let&#8217;s dig into how this while loop works.</p>



<ul class="wp-block-list">
<li><strong>kw_df &lt;- data.frame(Query = character()):</strong> This will be familiar to you by now, but we’re creating a new dataframe called kw_df and setting the Query column to be character</li>



<li><strong>index &lt;- 1:</strong> We’re creating a numerical object called index to match our loop against, starting it with 1</li>



<li><strong>while (index &lt;= nrow(gsc)) {:</strong> Now onto our loop. We’re starting with a while command rather than for, and then instructing R that <em>while</em> the value of our index object is less than or equal to the number of rows in our gsc dataframe, to execute our code</li>



<li><strong>if (gsc$Impressions[index] &gt;= 20) {: </strong>There’s our if statement again. In this case, our condition is that the Impressions value at the row number held in our index is greater than or equal to 20. If it is, the code in the braces runs</li>



<li><strong>kw_df &lt;- rbind(kw_df, data.frame(Query = gsc$Top.queries[index]))}:</strong> As with our for loop, if our if statement is true, we’re taking the query from the row our index points to and using rbind to add it to kw_df</li>



<li><strong>index &lt;- index + 1}:</strong> This is the important part. At the end of each pass, we add 1 to the value in our index, which moves the loop on to the next row. Once the number in index is larger than the number of rows in our dataset, the loop will stop. Without this line, index would stay at 1 and the loop would never end</li>
</ul>



<p>That’s a very simple introduction to while loops. There’s an awful lot that you can do with these, and I’m sometimes a little guilty of using them when I should use a more elegant solution because I’m in a hurry. Try them yourself – while loops in R have a lot of applications to SEO work.</p>
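<p>Since I mentioned running a while loop during a specific timeframe, here’s a rough sketch of that pattern. The fetch_latest function is a hypothetical stand-in for whatever API request you’d really make, and the two-second window and half-second pause are arbitrary values chosen to keep the example quick.</p>

```r
# Hypothetical stand-in for an API request: here it just returns a timestamp
fetch_latest <- function() {
  format(Sys.time(), "%H:%M:%S")
}

results <- character()
deadline <- Sys.time() + 2   # stop two seconds from now

# Keep collecting for as long as the current time is before the deadline
while (Sys.time() < deadline) {
  results <- c(results, fetch_latest())
  Sys.sleep(0.5)             # pause between requests
}

print(length(results))
```

<p>The condition is checked before every pass, so the loop ends on the first check after the deadline has passed rather than at the exact moment it passes.</p>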



<h2 class="wp-block-heading">Break &amp; Next Conditions In R Loops</h2>



<p>The break and next conditions in loops are commands that either stop the loop dead once that condition is met or simply move to the next iteration based on the output.<br><br>I don’t generally use these too much, if I’m honest, aside from using them as a crude form of error handling if I’m in a hurry, but they’re worth knowing. Let’s take a look at the break condition within a repeat loop.</p>



<h3 class="wp-block-heading">Break Conditions</h3>



<p><a href="https://www.programiz.com/r/break-next" target="_blank">Break conditions</a> are more or less exactly what they sound like – a condition under which a loop will break, or stop.</p>



<p>I don’t really use break conditions too much, largely because it’s quite rare I use <a href="https://www.datamentor.io/r-programming/repeat-loop" target="_blank">repeat loops</a>, unless they’re within a specific function and the loop is required for something. However, repeat loops are the best way to demonstrate the break condition in action.</p>



<p>Here’s a really simple repeat loop with a break condition that will take an object called repVal with a value of one, repeatedly printing that value and then adding 1 to the object each time. And then once we hit 10 repetitions of that loop, the break condition comes in and stops it.<br><br>Let’s have a look at the code and then we’ll break it down.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">repVal &lt;- 1

repeat{
  print(repVal)
  repVal &lt;- repVal + 1
  
  if(repVal > 10){
    break
  }
}
</pre>



<p>If this runs properly, you’ll get the following output in your console:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="487" height="399" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170017.png" alt="Repeat loop output with break condition in R" class="wp-image-3406" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170017.png 487w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170017-300x246.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170017-150x123.png 150w" sizes="(max-width: 487px) 100vw, 487px" /></figure>



<p>Simple, right? Let’s see how it works.</p>



<h3 class="wp-block-heading">The Repeat Loop &amp; Break Condition Explained</h3>



<p>Here’s that phrase again: let’s break it down.</p>



<ul class="wp-block-list">
<li><strong>repVal &lt;- 1:</strong> We’re creating our repVal object and assigning it a value of 1</li>



<li><strong>repeat{:</strong> This is the type of loop that we’re using, similar to the for and while loops</li>



<li><strong>print(repVal):</strong> Our loop will print the value of repVal on a continual cycle as long as our loop runs</li>



<li><strong>repVal &lt;- repVal + 1:</strong> Now we’re saying to add 1 to our repVal object every time the loop runs</li>



<li><strong>if(repVal &gt; 10){:</strong> There’s our if statement again. Nice and simple here, we’re just seeing if the value of repVal is greater than ten</li>



<li><strong>break}}:</strong> And finally, our break condition. Essentially, once our if statement becomes true (repVal has become greater than ten), it triggers our break condition and stops the loop</li>
</ul>



<p>And that’s how a break condition can be used in a repeat loop. It can be used in any of the other types of loops as well, and it can be a handy way to stop a loop once a certain condition is met.</p>
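<p>As an illustration of break in a for loop, the sketch below stops as soon as it finds the first query with 20 or more impressions. The gsc dataframe is made-up sample data, and first_hit and checked are names invented for this example.</p>

```r
# Made-up sample data standing in for the real Search Console export
gsc <- data.frame(
  Top.queries = c("apply in r", "rvest tutorial", "r for seo", "r loops"),
  Impressions = c(8, 12, 45, 20)
)

first_hit <- NA   # stays NA if no row meets the condition
checked <- 0      # how many rows we looked at before stopping

for (i in seq_len(nrow(gsc))) {
  checked <- checked + 1
  if (gsc$Impressions[i] >= 20) {
    first_hit <- gsc$Top.queries[i]
    break   # no need to look at the remaining rows
  }
}

print(first_hit)
print(checked)
```

<p>The last row also meets the condition, but the loop never reaches it because break has already ended the loop.</p>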



<p>Now let’s take a look at next conditions.</p>



<h3 class="wp-block-heading">Next Conditions In R Loops</h3>



<p>The <a href="https://www.datamentor.io/r-programming/break-next" target="_blank">next condition</a> is one I do use a little bit more regularly. It skips the rest of the current iteration and moves on to the next one when a certain condition is met. For example, I sometimes use it for skipping empty outputs from APIs: if there’s no data returned for a certain keyword or page, I don’t want an error, I just want the loop to skip to the next one.</p>



<p>Let’s use another simple example, with a for loop this time.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for (val in 1:10){

  if (val == 5){

    next
  }
  print(val)
}
</pre>



<p>Again, nice and simple and if you’ve followed along so far, you should be able to figure out what’s happening here, but let’s run through it anyway.</p>



<h3 class="wp-block-heading">The Next Condition Explained</h3>



<p>Shall we see how this example works?</p>



<ul class="wp-block-list">
<li><strong>for (val in 1:10){:</strong> In this example, we’re going to use a for loop to cycle our commands through the values of one to ten</li>



<li><strong>if (val == 5){: </strong>You’re seeing why if statements are so fundamental to programming now, right? In this case, our if statement is checking to see if we’ve looped to a point where our val is exactly equal to five</li>



<li><strong>next}:</strong> If our if statement turns out to be true, and our val does exactly equal five, we skip the rest of the loop body and move straight on to the next number in our val series</li>



<li><strong>print(val):</strong> And finally, as long as we’ve not triggered our next condition, the current value of val is printed to the console</li>
</ul>



<p>If this all runs correctly, we should see the following in our R console:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="361" height="350" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170840.png" alt="R loop with next condition output" class="wp-image-3407" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170840.png 361w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170840-300x291.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/08/Screenshot-2024-08-19-170840-150x145.png 150w" sizes="(max-width: 361px) 100vw, 361px" /></figure>
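<p>To tie this back to the API use I mentioned, here’s a rough sketch of skipping empty results with next. The responses list is made-up data standing in for what an API might return per keyword, and the position and clicks columns are invented for the example.</p>

```r
# Made-up stand-in for a set of API responses, one per keyword.
# Two of them came back with no data.
responses <- list(
  "r for seo"      = data.frame(position = 3, clicks = 40),
  "rvest tutorial" = NULL,
  "r loops"        = data.frame(position = numeric(), clicks = numeric()),
  "apply in r"     = data.frame(position = 7, clicks = 12)
)

collected <- data.frame()

for (kw in names(responses)) {
  res <- responses[[kw]]

  # Nothing returned for this keyword: skip it rather than error
  if (is.null(res) || nrow(res) == 0) {
    next
  }

  res$keyword <- kw
  collected <- rbind(collected, res)
}

print(collected)
```

<p>Only the two keywords that returned rows end up in collected. The empty ones are passed over without stopping the loop.</p>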



<p>So as you can see, loops in R are quite simple and give you a really good way to iterate a command over a dataset, but there’s some controversy about when, or indeed <em>if</em>, you should ever use them in R.</p>



<h2 class="wp-block-heading">The Loop Vs Apply Debate</h2>



<p>The word is that using loops in R is dirty code. That they’re slow, that you’re using more code than you should need, that using the apply family is just <em>better</em>.</p>



<p>Personally, as someone that came to R from learning bits of a bunch of different languages and who is doing more with Python and Julia these days, loops have always made sense to me and been something of a go-to (as you’ll see in my other <a href="https://www.ben-johnston.co.uk/category/r/">R posts</a>), but I have come to appreciate the various apply methods available as well.</p>



<p>In my experience and through researching, it seems that loops in R being slow is something of a fallacy. When a loop is slow, the cause is usually what happens inside it, such as growing a dataframe with rbind on every pass, rather than the loop itself. That said, you do often end up writing more code than you would with one of the apply methods. Conversely, I’ve always found loops to be very reliable, whereas I sometimes have to take a couple of extra steps to get an apply working.</p>
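<p>As a small taste of the trade-off, here’s the same task written both ways: counting the words in each query. It’s a sketch on made-up sample data, and vapply is just one member of the apply family that could be used here.</p>

```r
# Made-up sample data standing in for the real Search Console export
gsc <- data.frame(
  Top.queries = c("r for seo", "rvest tutorial", "r loops", "apply in r"),
  Impressions = c(45, 12, 20, 8),
  stringsAsFactors = FALSE
)

# Loop version: the result vector is created at full length before the loop
words_loop <- integer(nrow(gsc))
for (i in seq_len(nrow(gsc))) {
  words_loop[i] <- length(strsplit(gsc$Top.queries[i], " ")[[1]])
}

# Apply version: one call, with the expected return type stated up front
words_apply <- vapply(
  gsc$Top.queries,
  function(q) length(strsplit(q, " ")[[1]]),
  integer(1),
  USE.NAMES = FALSE
)

print(words_loop)
print(words_apply)
```

<p>The apply version is shorter, but the loop version spells out each step, which is largely what the debate comes down to.</p>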



<p>Still, apply methods are great, and you’ll be using them a lot as you go through your R journey, and they’ll be the subject of my next post.</p>



<h2 class="wp-block-heading">Wrapping Up</h2>



<p>I promise, I did try to make this one a bit shorter than other pieces, but there’s what you need to know about using loops in R, covering the for loop, while loop, repeat loop and the break and next conditions. Try them yourself and I hope you find them useful.<br><br>Join me in the next piece, where I’ll be covering the various apply methods that you can use.<br><br>Until next time.</p>



<h3 class="wp-block-heading">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># For Loop

kwCount &lt;- 0

for (val in gsc$Impressions){
  if(val >= 20)  kwCount = kwCount+1
}

## On A List

kw20 &lt;- data.frame()

for (i in seq_along(gsc$Impressions)) {
  if (gsc$Impressions[i] >= 20) {
    kw20 &lt;- rbind(kw20, gsc[i, ])
  }
}

## On A Dataframe

kwL20 &lt;- data.frame()

for (i in seq_len(nrow(gsc))) {
  if (gsc$Impressions[i] &lt;= 20) {
    kwL20 &lt;- rbind(kwL20, gsc[i, ])
  }
}

# While Loop

kw_df &lt;- data.frame(Query = character())

index &lt;- 1

while (index &lt;= nrow(gsc)) {
  if (gsc$Impressions[index] >= 20) {
    kw_df &lt;- rbind(kw_df, data.frame(Query = gsc$Top.queries[index]))
  }
  index &lt;- index + 1
}

# Break &amp; Next Conditions

## Repeat Loop With Break Condition

repVal &lt;- 1

repeat{
  print(repVal)
  repVal &lt;- repVal + 1
  
  if(repVal > 10){
    break
  }
}

## For Loop With Next Condition

for (val in 1:10){

  if (val == 5){

    next
  }
  print(val)
}</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 6: Using APIs In R</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-6-using-apis-in-r/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Wed, 10 Jul 2024 08:40:01 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">https://www.ben-johnston.co.uk/?p=3334</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-6-using-apis-in-r/">R For SEO Part 6: Using APIs In R</a></p>
<p>Wow, we’re at part 6 of my R for SEO series. Welcome back. I really hope you’re finding this useful, and by...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-6-using-apis-in-r/">R For SEO Part 6: Using APIs In R</a></p>
<p>Wow, we’re at part 6 of my <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a> series. Welcome back. I really hope you’re finding this useful, and by now have started to use R in your work. Today we’re going to look at one of my favourite topics: using APIs in R.</p>



<p>There are, obviously, millions of APIs available, so today I will just look at a couple of my favourite SEO-specific ones which will give you a basis for using the different types. We’re going to cover SERPAPI, SEMRush and <a class="wpil_keyword_link" href="https://seranking.com/?ga=2640572&amp;source=link" target="_blank" rel="noopener" title="SE Ranking" data-wpil-keyword-link="linked" data-wpil-monitor-id="196">SE Ranking</a>, how to get data out of them, how to authenticate with them and how to work with the data they give you.</p>



<p>Now that’s out the way, let’s get started.</p>






<h2 class="wp-block-heading">What Is An API?</h2>



<p>An API – or Application Programming Interface – is a way of getting data out of, or pushing data into, another program or application from your own program or application.</p>



<p>I promise you, this is nowhere near as complicated as it sounds. In fact, we’ve already used two APIs fairly extensively in this series – the <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">Google Analytics and Search Console APIs</a>. These came from pre-built packages, but using your own API of choice isn’t that difficult either.</p>



<p>Let’s take a look at how to do this.</p>



<h3 class="wp-block-heading">Read The Documentation</h3>



<p>Every API does, or should, come with extensive documentation which tells you how to create your queries and run your API calls. Sadly, since R isn’t as popular as Python in the SEO space, there is a lack of R-specific documentation for most APIs, but once you have the basics of creating your calls down, you’ll be able to work most of them out.</p>



<h2 class="wp-block-heading">The Anatomy Of HTTP API Requests</h2>



<p>In R, and most other languages, the majority of APIs are called via HTTP requests – essentially creating a URL and downloading the content of that URL into your environment, where you can shape it into a data frame. The outputs come in a variety of formats, but JSON is the most common, and also my preferred output.</p>



<p>In its most common form, an HTTP API request, or API call, has the following core elements:</p>



<ul class="wp-block-list">
<li><strong>The endpoint: </strong>The core URL that the API is hosted on</li>



<li><strong>The authentication: </strong>Usually a text string called the API key. Some platforms such as the SE Ranking API do this differently and I’ll cover that later in this article</li>



<li><strong>The query: </strong>The parameters in the URL that tell the API what data we want out of it</li>
</ul>



<p>There are obviously different elements in every API, but I’ve found that if you think of the call in that format, it helps. Again, <strong>read the documentation</strong>.</p>
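<p>As a rough sketch of that three-part structure, here’s how the pieces fit together in base R. The endpoint, key and parameter names below are placeholders I’ve made up for illustration, not a real API, so nothing is actually requested:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Placeholder values for illustration only - swap in the real ones from your API's documentation
exampleEndpoint &lt;- "https://api.example.com/v1/search"  # the endpoint
exampleKey &lt;- "YOUR_API_KEY"                            # the authentication
exampleQuery &lt;- "q=tv%20units&amp;country=uk"               # the query

# Paste the three parts into one URL: ? starts the query string, &amp; separates parameters
exampleCall &lt;- paste0(exampleEndpoint, "?", exampleQuery, "&amp;api_key=", exampleKey)

print(exampleCall)</pre>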



<p>Now we’ve got that down, let’s think about how JSON works in R.</p>



<h2 class="wp-block-heading">Working With JSON In R</h2>



<p>JSON, or JavaScript Object Notation, is a standard way for data to be exported from APIs and many other systems. It’s great, but it’s not always “tidy” when it comes to working with it in R. Due to the amount of data available and how R handles nesting of data, it can be a little challenging.</p>



<p>Fortunately, there are packages and conventions within R to make working with this data an easier experience.</p>



<h3 class="wp-block-heading">The jsonlite Package</h3>



<p>There are lots of JSON-related packages available on CRAN – over 60, last time I checked – but <a href="https://cran.r-project.org/web/packages/jsonlite/index.html" target="_blank">jsonlite </a>has always served me well, so it’s the one I’m using for this series. As you go further into your R journey, you may well find a package that you prefer, and that’s fine, but I hope this gives you a grounding in using JSON APIs in R.</p>



<p>You can install the jsonlite package with the usual commands:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">install.packages("jsonlite")

library(jsonlite)</pre>



<p>And you can read the <a href="https://cran.r-project.org/web/packages/jsonlite/jsonlite.pdf" target="_blank">jsonlite documentation here</a>.</p>
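<p>Before calling a live API, it can help to see what fromJSON does with a small piece of JSON. The JSON string below is made up for illustration; it loosely mimics a response with one nested object and one array of results:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(jsonlite)

# A made-up JSON string, not real API output
sampleJson &lt;- '{"search": {"q": "tv units"}, "results": [{"position": 1, "link": "https://example.com/a"}, {"position": 2, "link": "https://example.com/b"}]}'

# fromJSON accepts a JSON string as well as a URL
parsed &lt;- fromJSON(sampleJson, simplifyDataFrame = TRUE)

# Inspect the structure of what came back
str(parsed)</pre>



<p>The nested object comes back as a list and the array of records comes back as a data frame, which is the pattern to expect from most JSON APIs.</p>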



<p>Now that’s installed, let’s put together our first API call using SERPAPI.</p>



<h2 class="wp-block-heading">Using SERPAPI In R</h2>



<p>There are numerous search engine scraping APIs available, but <a href="https://serpapi.com/" target="_blank">SERPAPI </a>is my favourite. It offers a fantastic amount of data from almost any geographical area for a very low cost and it works brilliantly in R. I’m a fan.</p>



<p>I’ve built a number of tools with SERPAPI over the last couple of years, but here’s a really simple one for our first call.</p>



<p>Let’s see what the SERP looks like for the term “TV Units” in the UK.</p>



<h3 class="wp-block-heading">Building Our First SERPAPI Call In R</h3>



<p>As I mentioned earlier, there are components to every API call, one of which is the authentication, or the API key. Fortunately, SERPAPI is very easy to sign up for and even offers a free plan of 100 calls a month, which will be more than enough for all the tutorials in this series. If you do feel like you’ll need more calls, you won’t find many better platforms for the price.</p>



<p>You will need API keys to go through these tutorials as I can’t share mine, but the key focus for later pieces will be ones you can use for free, like SERPAPI. Firstly, go to <a href="https://serpapi.com/" target="_blank">this link and sign up for free</a>.</p>



<p>Now you’ve signed up, let’s create an object using your API key. This is easy and can be done like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpApiKey &lt;- "XXXXXXXXXXXX"</pre>



<p>Replace the X’s with your SERPAPI key, but remember to keep the speech marks.</p>
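<p>If you’d rather not type your key directly into a script that you might share, one option is to store it in an environment variable and read it in with base R’s Sys.getenv. This is only a sketch: SERPAPI_KEY is a name I’ve made up, and you’d need to set it yourself (in your .Renviron file, for example):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># SERPAPI_KEY is a hypothetical environment variable name - use whatever you've set
# If it isn't set, fall back to the placeholder so the object still exists
serpApiKey &lt;- Sys.getenv("SERPAPI_KEY", unset = "XXXXXXXXXXXX")</pre>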



<p>Now we need to create variables for the rest of the API call.</p>



<h3 class="wp-block-heading">Creating A SERPAPI Endpoint Object In R</h3>



<p>We’re going to use SERPAPI as an example here, but to use most APIs in R, you’re going to need to create an endpoint object. Here’s how to do it.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpApiEndpoint &lt;- "https://serpapi.com/search.json?engine=google"</pre>



<p>This is a simple object that just brings the SERPAPI endpoint into our R environment. You can recreate this with most APIs that use HTTP requests; just change the endpoint URL accordingly.</p>



<p>Now we have our endpoint and API key variables created, we need to put our actual query together.</p>



<h3 class="wp-block-heading">Creating Our SERPAPI Query In R</h3>



<p>Building the query is the third part of an API call, and this is where that <a href="https://serpapi.com/search-api" target="_blank">documentation</a> becomes really important. What we’re going to do here is tell SERPAPI what we want from its API.</p>



<p>Let’s create another variable for our query, remembering that our goal is to see the SERP for “TV units” in the UK. Here’s how to do that.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpApiCall &lt;- paste(serpApiEndpoint, "&amp;q=tv%20units", "&amp;location=United+Kingdom&amp;google_domain=google.co.uk", "&amp;api_key=", serpApiKey, sep = "")</pre>



<p>Let’s break that query parameter down.</p>



<ul class="wp-block-list">
<li><strong>paste(:</strong> We’re using the paste function from base R to create a string of our various parameters</li>



<li><strong>serpApiEndpoint: </strong>We’re calling the API endpoint we created earlier, which already includes engine=google to invoke the Google search engine</li>



<li><strong>&#8220;&amp;q=tv%20units&#8221;: </strong>The “&amp;q=” parameter is adding the query to our URL – API calls are generally constructed using &amp; to define different parameters. In this case our query is “tv units”. You’ll notice that we’ve used %20 in the query here – this is the URL encoding for a space</li>



<li><strong>&amp;location=United+Kingdom&amp;google_domain=google.co.uk:</strong> We’re looking at the UK version of Google</li>



<li><strong>&#8220;&amp;api_key=&#8221;, serpApiKey:</strong> This is the parameter for the API key and in this case, rather than put the whole thing into our call, we’re using our serpApiKey object</li>



<li><strong>sep = &#8220;&#8221;):</strong> As with previous pieces, the sep parameter tells R what separator we want to use in our final pasted output. As we don’t want there to be a separator, the speech marks are empty. Don’t forget your closing bracket!</li>
</ul>
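<p>If you don’t fancy typing %20 by hand, base R’s URLencode function can do the encoding for you. Here’s a minimal sketch of the same call built that way; the key is a placeholder, and the searchTerm and encodedTerm names are just my own choices for this example:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpApiEndpoint &lt;- "https://serpapi.com/search.json?engine=google"
serpApiKey &lt;- "XXXXXXXXXXXX"  # placeholder - skip this line if you've already created serpApiKey

searchTerm &lt;- "tv units"

# URLencode swaps the space for %20; reserved = TRUE also encodes characters like &amp; and =
encodedTerm &lt;- URLencode(searchTerm, reserved = TRUE)

serpApiCall &lt;- paste(serpApiEndpoint, "&amp;q=", encodedTerm, "&amp;location=United+Kingdom&amp;google_domain=google.co.uk", "&amp;api_key=", serpApiKey, sep = "")

print(serpApiCall)</pre>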



<p>Now we’ve created our full API Query, let’s run it in our R console.</p>



<h3 class="wp-block-heading">Running A SERPAPI Query In R</h3>



<p>Now we’ve built our first SERPAPI http request, we need to call it into our R environment using the jsonlite package.</p>



<p>Here’s how to do that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpAPI1 &lt;- fromJSON(serpApiCall, simplifyDataFrame = TRUE)</pre>



<p>This will take a couple of seconds to run, but once it does, you’ll have the entire SERP for the keyword “TV units” in the UK in your R environment.</p>



<p>When you run this command, you’ll see the following in your RStudio environment explorer.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="389" height="54" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/Screenshot-2024-07-10-084702.png" alt="SERPAPI data in RStudio environment explorer" class="wp-image-3336" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/Screenshot-2024-07-10-084702.png 389w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/Screenshot-2024-07-10-084702-300x42.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/Screenshot-2024-07-10-084702-150x21.png 150w" sizes="(max-width: 389px) 100vw, 389px" /></figure>



<p>Now we have our SERPAPI data in JSON format in our R environment. Let’s see how to explore that.</p>



<h2 class="wp-block-heading">Examining JSON Data In R</h2>



<p>As you’ll have seen from the above, there’s not really that much involved in getting JSON data from an API, but it’s all nested, which can make the initial exploration a little challenging as it’s a series of lists. Fortunately, the SERPAPI output is cleanly formatted and denoted, so it’s not too difficult, and it’s way better than having it in separate CSV files, the way some APIs do.</p>



<p>If we start by using the str() command, we’ll see the following output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">str(serpAPI1)</pre>



<figure class="wp-block-image size-full"><img decoding="async" width="998" height="381" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image.png" alt="SERPAPI data in RStudio console" class="wp-image-3337" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image.png 998w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-300x115.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-150x57.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-768x293.png 768w" sizes="(max-width: 998px) 100vw, 998px" /></figure>



<p>As you can see, what we have here is a series of lists with data frames nested within them. It looks a little ugly at first, but it’s actually simple once you get to grips with it.</p>



<p>Essentially, we need to think of each header we found in our str command as a different data frame, so we’ll be using the $ operator in our exploration more than once.</p>



<p>Let’s take a look at the organic results from our SERPAPI query:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpAPI1$organic_results</pre>



<p>This will give you the following output:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="1009" height="356" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-1.png" alt="SERPAPI organic results in RStudio json format" class="wp-image-3338" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-1.png 1009w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-1-300x106.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-1-150x53.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-1-768x271.png 768w" sizes="(max-width: 1009px) 100vw, 1009px" /></figure>



<p>We can do the same with every data frame within our export. At the time of writing, we can see that ikea.com had the number 10 position in the UK for the TV units term. We can add a further $ to explore each column within the data frames in the list, for example:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpAPI1$organic_results$title</pre>



<p>And that’s how we explore JSON data in R.</p>
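<p>If you want to practise this without spending API credits, here’s a small sketch using a mock object. The structure and values are made up to loosely resemble the SERPAPI output above, so treat the names as illustrative:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># A mock list standing in for an API response - the values are invented
mockSerp &lt;- list(
  search_parameters = list(q = "tv units"),
  organic_results = data.frame(
    position = 1:3,
    title = c("Result A", "Result B", "Result C"),
    link = c("https://example.com/a", "https://example.com/b", "https://example.com/c"),
    stringsAsFactors = FALSE
  )
)

# names() lists the top-level elements we can reach with $
names(mockSerp)

# Chain $ to drill into a data frame and then a column
mockSerp$organic_results$title

# Square brackets pull out several columns at once
topResults &lt;- mockSerp$organic_results[, c("position", "title")]</pre>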



<p>Now let’s scale our API collection.</p>



<h2 class="wp-block-heading">Running Multiple JSON API Calls In R</h2>



<p>I’ll be honest here – I’ve run <em>tons </em>of JSON APIs with R over the years, and I have found very few that work the same way. It very much depends on the API and the output, so always budget some time and API queries for trial and error, especially if your API charges by the request.</p>



<p>Now we know how to run singular SERPAPI queries in R, but we’ve just got the raw JSON data, which means we’ve got to do more work to get the information we want into a data frame. Why don’t we create a function that can dynamically create our API calls and extract the information we want into a data frame all in one go?</p>



<p>Here’s how.</p>



<h3 class="wp-block-heading">Multiple SERPAPI Queries In R</h3>



<p>Firstly, you’ll want a data frame with a few keywords that you’d like to check. You can either put them into a CSV file and read them in using the read.csv command, or you can create it directly in R like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">keywords &lt;- data.frame(keyword = c("sliding wardrobes", "bookcases", "tv units", "dining tables"))</pre>



<p>Alright, so we’ve got our data frame (call it “keywords” to follow along). Now we need to create a function to dynamically build our API calls and to pull the ranking URLs, domains, shown title and description and the position it’s in.</p>



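<p>Firstly, we’ll want to re-visit our domainNames function from <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/">Part 4</a>, which will strip a URL to a domain name. The version below has one small tweak: it uses vapply so that it returns a domain for every URL in a column rather than just the first. The code is below:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">domainNames &lt;- function(x){
  
  # Strip the protocol and www., split on "/" and keep the first piece of each URL
  # vapply means this works on a single URL or a whole column of them
  vapply(strsplit(gsub("http://|https://|www\\.", "", x), "/"), `[`, character(1), 1)
  
}</pre>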



<p>Now let’s look at our function.</p>



<h3 class="wp-block-heading">A SERPAPI Function In R</h3>



<p>This is a fairly simple R function, and if you’ve been following the series along, there shouldn’t be anything surprising.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpAPIFun &lt;- function(x){
  
  require(tidyverse)
  
  require(jsonlite)
  
  apiCall &lt;- paste(serpApiEndpoint, "&amp;q=", x, "&amp;location=United+Kingdom&amp;google_domain=google.co.uk", 
                   "&amp;api_key=", serpApiKey, sep = "")
  
  apiCall &lt;- gsub(" ", "%20", apiCall)
  
  serpData &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE)
  
  serpData$organic_results$domain &lt;- domainNames(serpData$organic_results$link)
  
  output &lt;- data.frame(serpData$search_parameters$q, serpData$organic_results$link, 
                       serpData$organic_results$title, serpData$organic_results$snippet, 
                       serpData$organic_results$domain, serpData$organic_results$position)
  
  colnames(output) &lt;- c("Keyword", "URL", "Title", "Description", "Domain", "Position")
  
  return(output)
  
}</pre>



<p>As always, let’s break it down:</p>



<h3 class="wp-block-heading">How The Function Works</h3>



<p>This looks a lot more complicated than it actually is.</p>



<ul class="wp-block-list">
<li><strong>serpAPIFun &lt;- function(x){:</strong> Our function is called serpAPIFun and uses an x variable. You can, obviously, call it whatever you like. I’ve simplified it for this piece and just used one variable, but you can add others</li>



<li><strong>require():</strong> We’re adding in the packages that we need for this function. In this case, we need the tidyverse and jsonlite</li>



<li><strong>apiCall &lt;- paste(serpApiEndpoint, &#8220;&amp;q=&#8221;, x, &#8220;&amp;location=United+Kingdom&amp;google_domain=google.co.uk&#8221;, &#8220;&amp;api_key=&#8221;, serpApiKey, sep = &#8220;&#8221;):</strong> As in the single example above, we’re pasting our API URL together, with x being our keyword from our dataframe and using the UK region. If you wanted to use different regions, this could be a y variable</li>



<li><strong>apiCall &lt;- gsub(&#8220; &#8221;, &#8220;%20&#8221;, apiCall):</strong> We’re using gsub to replace spaces with %20, the URL encoding we typed by hand in the single call, so multi-word keywords don’t break our API call</li>



<li><strong>serpData &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE):</strong> Now our API call is completed, we can use jsonlite’s fromJSON function to pull the data into our serpData object</li>



<li><strong>serpData$organic_results$domain &lt;- domainNames(serpData$organic_results$link):</strong> This will run our domainNames function on the URL and create a new column of the domain name</li>



<li><strong>output &lt;- data.frame(:</strong> We’re creating our output dataframe, with the columns chosen above (keyword, URL, title, description,&nbsp; domain, position), but you can use whichever you would like from the dataset</li>



<li><strong>colnames(output) &lt;- c(:</strong> This uses the colnames command to rewrite the column headers in our output dataframe</li>



<li><strong>return(output):</strong> The finale of our function is to return our completed data frame</li>
</ul>



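<p>Now to run it, we need to use the following command, which will run through every API call. The reduce and bind_rows commands come from the tidyverse, so make sure you’ve loaded it with library(tidyverse) first:</p>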



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">serpAPI2 &lt;- reduce(lapply(keywords$keyword, serpAPIFun), bind_rows)</pre>



<p>And it will return the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="992" height="287" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-2.png" alt="SERPAPI data in a dataframe in R" class="wp-image-3339" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-2.png 992w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-2-300x87.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-2-150x43.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-2-768x222.png 768w" sizes="(max-width: 992px) 100vw, 992px" /></figure>



<p>Great, right? I’ll be running through the different kinds of <a data-wpil-monitor-id="206" href="https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/">loops</a> and apply commands in R in the next couple of articles, but for now, that’s how you can run multiple SERPAPI queries in R.</p>
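<p>To see what the lapply and bind step is doing without using any API credits, here’s a sketch with a stand-in function. fakeSerpFun is made up for this example and returns dummy rows; I’ve used base R’s do.call and rbind here, which do a similar job to reduce and bind_rows when the data frames have matching columns:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># A stand-in for serpAPIFun that returns two dummy rows per keyword - no API call involved
fakeSerpFun &lt;- function(x){
  data.frame(Keyword = x, Position = 1:2, stringsAsFactors = FALSE)
}

testKeywords &lt;- data.frame(keyword = c("bookcases", "tv units"), stringsAsFactors = FALSE)

# lapply runs the function once per keyword and returns a list of data frames
resultsList &lt;- lapply(testKeywords$keyword, fakeSerpFun)

# do.call(rbind, ...) stacks that list into a single data frame
allResults &lt;- do.call(rbind, resultsList)

print(allResults)</pre>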



<p>Now let’s look at the SE Ranking API.</p>



<h2 class="wp-block-heading">Using The SE Ranking API In R</h2>



<p>SE Ranking is one of my favourite SEO tools – so much so, I chose to bring it to my team at <a href="https://www.harvestdigital.com/" target="_blank">Harvest Digital</a> and I am also an affiliate. It’s a truly fantastic platform that offers everything a number of other tools do at a cheaper price, and you can’t fault the speed, functionality and support that the team give. Check it out from my link and tell them I sent you!</p>



<p>Now when it comes to the API, it’s a very full-featured one, but it can be a little tricky to get some of your IDs working. So first, we want to find out all the available databases.</p>



<p>Let’s take a look at how we can construct a function in R to gather all our search engine IDs from the SE Ranking API and, as always, we’ll break down how it works.</p>



<h3 class="wp-block-heading">Finding Search Engine IDs From The SE Ranking API In R</h3>



<p>We’ll start with a really simple function to get all the available search engine IDs from the SE Ranking API into our R environment.</p>



<p>First, we want to create an object of our API key. We’ll call this seRankingAPI. Obviously you’ll need your own API key for this.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">seRankingAPI &lt;- "XXXXXXXX"</pre>



<p>Replace the X’s with your own API key.</p>



<p>Now let’s put our function together.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">seRankingDBs &lt;- function(x){
  
  apiCall &lt;- paste("https://api4.seranking.com/system/search-engines?token=", x, 
                   sep = "")
  
  databases &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE)
  
  output &lt;- data.frame(databases)
}</pre>



<p>Let’s see how it works.</p>



<h3 class="wp-block-heading">How The SE Ranking Database Function Works</h3>



<p>Here’s that phrase again: let’s break it down.</p>



<ul class="wp-block-list">
<li><strong>seRankingDBs &lt;- function(x){:</strong> We’re creating a function called seRankingDBs, with a single x variable</li>



<li><strong>apiCall &lt;- paste(&#8220;https://api4.seranking.com/system/search-engines:</strong> As before, we’re constructing our API call using the paste command. You can see that we’ve used the api4.seranking.com domain and the /system/search-engines endpoint</li>



<li><strong>?token=&#8221;, x, sep = &#8220;&#8221;): </strong>Here, we’re adding our API key as a query string. In this case, the API key parameter is called “token” and x is where we’ll pass in the seRankingAPI object that we just created. Again, using the sep = “” parameter means there is nothing to separate the objects that we’re pasting together. That’s our URL constructed</li>



<li><strong>databases &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE):</strong> As with SERPAPI, we’re using jsonlite’s fromJSON command to download the JSON data into an object called databases</li>



<li><strong>output &lt;- data.frame(databases)}:</strong> Finally, we’re creating our output – a dataframe of the databases in this case</li>
</ul>



<p>Told you it was simple.</p>



<p>To run it, simply type:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">seRankingDatabases &lt;- seRankingDBs(seRankingAPI)</pre>



<p>And that will give you the following output:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="691" height="222" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-3.png" alt="SE Ranking database list in RStudio" class="wp-image-3341" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-3.png 691w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-3-300x96.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-3-150x48.png 150w" sizes="(max-width: 691px) 100vw, 691px" /></figure>



<p>Now let’s create another function to get search volume on our keywords from SE Ranking in R.</p>



<h3 class="wp-block-heading">Search Volume From The SE Ranking API In R</h3>



<p>Before we get going, we’ll want to know our search engine region ID. Fortunately, we’ve already got all our databases into our R environment, so it’s not difficult to find the ID we want.</p>



<p>Let’s say we want to look at Google UK. From our list, its ID is 180.</p>



<p>Now let’s create our function.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">seRankingKeywords &lt;- function(x, y, z){
  
  apiCall &lt;- paste("https://api4.seranking.com/system/volume?region_id=", y, 
                   "&amp;keyword=", x, "&amp;token=", z, sep = "")
  
  apiCall &lt;- gsub(" ", "%20", apiCall)
  
  seRankingData &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE)
  
  seRankingData &lt;- data.frame(x, seRankingData)
  
  colnames(seRankingData) &lt;- c("Keyword", "Search Volume")
  
  output &lt;- seRankingData
}</pre>



<p>I’ve gone a little backwards here on the placement of the y and x variables, but that’ll make sense in a second.</p>



<h3 class="wp-block-heading">How The SE Ranking Keyword Volume Function Works</h3>



<p>As always, let’s break it down.</p>



<ul class="wp-block-list">
<li><strong>seRankingKeywords &lt;- function(x, y, z){:</strong> Our function is called seRankingKeywords and we’re using x, y and z variables</li>



<li><strong>apiCall &lt;- paste(&#8220;https://api4.seranking.com/system/volume?region_id=&#8221;, y, &#8220;&amp;keyword=&#8221;, x, &#8220;&amp;token=&#8221;, z, sep = &#8220;&#8221;):</strong> As previously, we’re constructing our URL. Here, we’re using y to be our region ID, x for our keywords and z for our API key</li>



<li><strong>apiCall &lt;- gsub(&#8220; &#8221;, &#8220;%20&#8221;, apiCall):</strong> We’re again using the gsub command to replace spaces with the URL encoding of space &#8211; %20</li>



<li><strong>seRankingData &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE):</strong> As before, we’re using jsonlite’s fromJSON command to download our data</li>



<li><strong>seRankingData &lt;- data.frame(x, seRankingData):</strong> We’re turning that JSON data into a dataframe with the columns x (our keyword) and the search volume output from the API</li>



<li><strong>colnames(seRankingData) &lt;- c(&#8220;Keyword&#8221;, &#8220;Search Volume&#8221;): </strong>We’re naming our columns “Keyword” and “Search Volume”</li>



<li><strong>output &lt;- seRankingData}:</strong> Finally, we want to return our full dataframe</li>
</ul>



<p>Again, it’s nice and simple.</p>



<p>To run it, let’s use our keywords dataframe from earlier.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">seRankingRun &lt;- reduce(lapply(keywords$keyword, seRankingKeywords, "180", seRankingAPI), 
                       bind_rows)</pre>



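<p>Here, lapply takes each keyword from our keywords dataframe as x, then our function name (seRankingKeywords), and passes the remaining arguments to it in order: our Google UK region ID (“180”) as y and our SE Ranking API key (seRankingAPI) as z. That’s why the keyword is x in the function even though it comes second in the URL. This command will loop through all our keywords and merge the data into a single dataframe.</p>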
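<p>If the way lapply hands over those extra arguments isn’t obvious, this small sketch shows it with a stand-in function. showArgs is made up for this example and just reports what it received; no API call is made:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># A stand-in with the same x, y, z signature as seRankingKeywords
showArgs &lt;- function(x, y, z){
  paste("keyword =", x, "| region =", y, "| token =", z)
}

# Each keyword becomes x; "180" and "XXXXXXXX" are passed on as y and z, in that order
argCheck &lt;- lapply(c("bookcases", "tv units"), showArgs, "180", "XXXXXXXX")

print(argCheck[[1]])</pre>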



<p>If you run it in your console and then type “seRankingRun”, you’ll see the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="346" height="118" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-4.png" alt="Search volumes from SE Ranking in R" class="wp-image-3342" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-4.png 346w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-4-300x102.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-4-150x51.png 150w" sizes="(max-width: 346px) 100vw, 346px" /></figure>



<p>And there we go. That’s how you can use the SE Ranking API in R. There’s a lot that you can do with this API and I may well go into further depth on it in a later post when I’ve finished this series as it’s very impressive.</p>



<p>Now let’s look at arguably the most popular SEO tool on the market: SEMRush.</p>



<h2 class="wp-block-heading">The SEMRush API In R</h2>



<p>The <a class="wpil_keyword_link" href="https://semrush.sjv.io/c/3960766/1857951/13053" target="_blank" rel="noopener" title="SEMRush" data-wpil-keyword-link="linked" data-wpil-monitor-id="203">SEMRush</a> API is another great tool and is essential for a lot of SEO work, and I use it a lot with R. Disclaimer again – I’ve been a customer of SEMRush for more years than I care to remember, and I am also an affiliate, because it’s so awesome.</p>



<p>The SEMRush API also has the option to export to CSV or JSON. Since we’ve covered JSON in a fair amount of depth today, I’m going to show you the CSV export option in R, which is actually its default output.</p>



<p>Due to the way the SEMRush API handles error reporting when a keyword isn’t in its database, this is also a really good opportunity to incorporate some error handling in our function. So, let’s get going.</p>



<h3 class="wp-block-heading">Authenticating With The SEMRush API In R</h3>



<p>Authenticating with the SEMRush API in R is very similar to how we did it with SERPAPI – just adding our API key to the query string, so that’s handy.</p>



<p>Let’s get going.</p>



<p>Firstly, we want to create an object with our API key. This works in the same way as other APIs, so very simple:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">semRushAPI &lt;- "XXXXXXXXX"</pre>



<p>Replace the X’s with your own API key again and don’t forget the speech marks.</p>



<p>Now we want to start building our function and gathering our data. Here’s how.</p>



<h3 class="wp-block-heading">Gathering Keyword Data From The SEMRush API In R</h3>



<p>Our function isn’t too different to our others, but we’ve got a new way of reading the data in there, which is <a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">read.csv</a> rather than our usual fromJSON command. You’ll also notice that we’re using a different separator than usual, and we’re going to use a different command to run it.</p>



<p>Let’s take a look at our function, and then we’ll break it down.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">semRushKeywordData &lt;- function(x, y){
  
  # Build the request URL: x is the keyword, y is the regional database
  apiCall &lt;- paste("https://api.semrush.com/?type=phrase_this&amp;key=", semRushAPI, 
                   "&amp;phrase=", x, "&amp;export_columns=Ph,Nq,Cp,Co,Nr,Td,In&amp;database=", 
                   y, sep = "")
  
  # Encode any spaces in the keyword
  apiCall &lt;- gsub(" ", "%20", apiCall)
  
  # The response is semicolon-separated, hence sep = ";"
  semRushData &lt;- read.csv(apiCall, header = TRUE, sep = ";", stringsAsFactors = FALSE)

}</pre>



<h3 class="wp-block-heading">How The SEMRush Keyword Research Function Works</h3>



<p>As you can see, it’s not a million miles away from our other functions, which shows that using APIs in R follows fairly similar rules. Let’s see what goes into this one.</p>



<ul class="wp-block-list">
<li><strong>semRushKeywordData &lt;- function(x, y){:</strong> We’re creating our semRushKeywordData function with two variables: x for the keyword and y for the database</li>



<li><strong>apiCall &lt;- paste(&#8220;https://api.semrush.com/?type=phrase_this&amp;key=&#8221;: </strong>As previously, we’ve got our endpoint and other elements that we want to paste together to build our API call URL. The ?type=phrase_this query says that we’re using the keyword research function of the API</li>



<li><strong>semRushAPI, &nbsp;&#8220;&amp;phrase=&#8221;, x,:</strong> We’re adding the semRushAPI key object we created earlier and our x variable for the phrase, or our keyword from our dataset</li>



<li><strong>&#8220;&amp;export_columns=Ph,Nq,Cp,Co,Nr,Td,In:</strong> We’re looking to extract the following columns from SEMRush with this data: Keyword, Search Volume, CPC, Competition, Number of Results, Trends and Intent. I told you there was a lot of information you could get from this API!</li>



<li><strong>&amp;database=&#8221;, y, sep = &#8220;&#8221;):</strong> SEMRush has databases all over the world, so you can pull data relevant to your geographic region. <a href="https://developer.semrush.com/api/v3/analytics/basic-docs/#databases/" target="_blank">The documentation has the full list</a>, and for this function, I’ve set the database to our y variable, so you can change it based on your location. We’re finishing the paste command up with sep=”” as usual, so there are no spaces in the URL</li>



<li><strong>apiCall &lt;- gsub(&#8221; &#8220;, &#8220;%20&#8221;, apiCall):</strong> As with our other API calls in our R environment, we’re using gsub to replace any spaces in our keywords with %20 to encode them</li>



<li><strong>semRushData &lt;- read.csv(apiCall, header = TRUE, sep = &#8220;;&#8221;, stringsAsFactors = FALSE)}:</strong> Now to finish the function with our trusty read.csv command. You’ll notice that there are a couple of extra parameters in there for this one, namely header = TRUE (which does exactly what you’d think it would) and sep = “;” – this is there because the SEMRush API sends data separated by a semicolon rather than your traditional commas, largely because it nests trend data with commas</li>
</ul>
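


<p>As an aside, here’s a minimal sketch of the URL-building step on its own, so you can see exactly what gets sent. The buildSemRushUrl name and the DEMO_KEY placeholder are made up for this example, and I’ve swapped gsub for base R’s URLencode, which is an alternative rather than what the function above does.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical helper for illustration only; not part of the function above.
# "DEMO_KEY" is a placeholder, so nothing here calls the API.
buildSemRushUrl &lt;- function(phrase, database, key){
  
  # reserved = TRUE also encodes characters such as &amp; and +,
  # which gsub(" ", "%20", ...) would leave untouched
  encodedPhrase &lt;- URLencode(phrase, reserved = TRUE)
  
  paste("https://api.semrush.com/?type=phrase_this&amp;key=", key, 
        "&amp;phrase=", encodedPhrase, 
        "&amp;export_columns=Ph,Nq,Cp,Co,Nr,Td,In&amp;database=", database, sep = "")
  
}

demoUrl &lt;- buildSemRushUrl("tv units &amp; stands", "uk", "DEMO_KEY")

print(demoUrl)</pre>



<p>With reserved = TRUE, URLencode also handles characters like &amp; that would otherwise break the query string, which might be worth considering if your keywords contain more than letters and spaces.</p>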



<p>And there we go, a nice simple function to get SEMRush data from the API in R. Now to run it, we do the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">semRushOutput &lt;- do.call(rbind, lapply(keywords$keyword, semRushKeywordData, "uk"))</pre>



<p>You’ll notice that this is slightly different to the other commands we’ve used to run our API calls before. Because of the way the SEMRush API sends out CSV data, the usual “reduce” command tends to give an error. By using do.call(rbind, ...) we’re essentially doing the same thing, but it tends to work a bit better with CSV data. You may still get some warnings, but not errors.</p>
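


<p>If one keyword does throw an error, the whole lapply run stops with it. Here’s a rough sketch of one way you could guard against that with base R’s tryCatch. The safeKeywordData and fakeFetch names are hypothetical, and fakeFetch is a stand-in with made-up numbers so the example runs without an API key; in practice you would pass semRushKeywordData in its place.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical wrapper: fetchFun is whichever function does the real work
safeKeywordData &lt;- function(x, y, fetchFun){
  
  tryCatch(
    fetchFun(x, y),
    error = function(e){
      # Report the problem and return NULL instead of stopping the run
      message("Skipping '", x, "': ", conditionMessage(e))
      NULL
    }
  )
  
}

# Fake fetcher for demonstration: fails for one keyword, made-up numbers otherwise
fakeFetch &lt;- function(x, y){
  
  if(x == "not a real keyword"){
    stop("nothing found")
  }
  
  data.frame(Keyword = x, Database = y, Search.Volume = 100, 
             stringsAsFactors = FALSE)
  
}

demoKeywords &lt;- c("tv units", "not a real keyword", "bookcases")

results &lt;- lapply(demoKeywords, safeKeywordData, "uk", fakeFetch)

# Drop the NULLs left by failed keywords before binding the rest together
results &lt;- Filter(Negate(is.null), results)

safeOutput &lt;- do.call(rbind, results)

print(safeOutput)</pre>



<p>The failed keyword is reported in the console and skipped, and the remaining rows are bound together as before.</p>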



<p>Now if you run your command on the keywords frame we created earlier, you’ll get the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="690" height="223" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-5.png" alt="Data from the SEMRush API in R" class="wp-image-3345" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-5.png 690w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-5-300x97.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/07/image-5-150x48.png 150w" sizes="(max-width: 690px) 100vw, 690px" /></figure>
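


<p>The semicolon separator is what keeps the Trends column in one piece. Here’s a small sketch using read.csv on a made-up response string rather than a live API call; the column names and figures are invented for illustration and a real response may well look different.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up response text in a semicolon-separated layout
sampleResponse &lt;- "Keyword;Search Volume;Trends\ntv units;1000;0.54,0.66,0.81,1.00"

sampleData &lt;- read.csv(text = sampleResponse, header = TRUE, sep = ";", 
                       stringsAsFactors = FALSE)

# The commas inside Trends survive because the separator is a semicolon,
# so we can split them out into numbers afterwards
trendValues &lt;- as.numeric(strsplit(sampleData$Trends[1], ",")[[1]])

print(sampleData)

print(trendValues)</pre>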



<p>Handy, right? And this is just the beginning of what you can do with APIs in R.</p>



<h2 class="wp-block-heading">Wrapping Up</h2>



<p>So there we have it – now you know how to use different types of APIs in R, ones with JSON and CSV outputs, and how to use them for SEO purposes.</p>



<p>Obviously there are millions of APIs out there, and I haven’t even scratched the surface of what’s possible, but hopefully this crash course will be enough to get you started.</p>



<p>As always, if you have any questions, please hit me up on <a href="https://x.com/ben_johnston80" target="_blank">Twitter/X</a> and if you’d like to get an alert when the next post drops, please sign up for my email list below.</p>



<p>Next time, we’ll be looking at different ways of running functions with loops and applys. I hope you’ll join me.</p>



<h3 class="wp-block-heading">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Install Packages

install.packages("jsonlite")

library(jsonlite)

install.packages("tidyverse")

library(tidyverse)

# SERPAPI

serpApiKey &lt;- "XXXXXXXXXXXXX"

## Endpoint

serpApiEndpoint &lt;- "https://serpapi.com/search.json?engine=google"

## TV Units Call

serpApiCall &lt;- paste(serpApiEndpoint, "&amp;q=tv%20units", "&amp;location=United+Kingdom&amp;google_domain=google.co.uk", 
                     "&amp;api_key=", serpApiKey, sep = "")

serpAPI1 &lt;- fromJSON(serpApiCall, simplifyDataFrame = TRUE)

#Exploring JSON

str(serpAPI1)

serpAPI1$organic_results

serpAPI1$organic_results$title

## Create Keywords Dataframe

keywords &lt;- data.frame(keyword = c("sliding wardrobes", "bookcases", "tv units",
                                   "dining tables"))

## Running Multiple Keywords In SERPAPI

## Domain Names Function

domainNames &lt;- function(x){
  
  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
  
}

## SERPAPI Function

serpAPIFun &lt;- function(x){
  
  require(tidyverse)
  
  require(jsonlite)
  
  apiCall &lt;- paste(serpApiEndpoint, "&amp;q=", x, "&amp;location=United+Kingdom&amp;google_domain=google.co.uk", 
                   "&amp;api_key=", serpApiKey, sep = "")
  
  apiCall &lt;- gsub(" ", "%20", apiCall)
  
  serpData &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE)
  
  serpData$organic_results$domain &lt;- sapply(serpData$organic_results$link, domainNames, USE.NAMES = FALSE)
  
  output &lt;- data.frame(serpData$search_parameters$q, serpData$organic_results$link, 
                       serpData$organic_results$title, serpData$organic_results$snippet, 
                       serpData$organic_results$domain, serpData$organic_results$position)
  
  colnames(output) &lt;- c("Keyword", "URL", "Title", "Description", "Domain", "Position")
  
  return(output)
  
}

serpAPI2 &lt;- reduce(lapply(keywords$keyword, serpAPIFun), bind_rows)

## SE Ranking

seRankingAPI &lt;- "XXXXXXXXXXXXX"

seRankingDBs &lt;- function(x){
  
  apiCall &lt;- paste("https://api4.seranking.com/system/search-engines?token=", x, 
                   sep = "")
  
  databases &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE)
  
  output &lt;- data.frame(databases)
}

seRankingDatabases &lt;- seRankingDBs(seRankingAPI)

## SE Ranking Keyword Volume

seRankingKeywords &lt;- function(x, y, z){
  
  apiCall &lt;- paste("https://api4.seranking.com/system/volume?region_id=", y, 
                   "&amp;keyword=", x, "&amp;token=", z, sep = "")
  
  apiCall &lt;- gsub(" ", "%20", apiCall)
  
  seRankingData &lt;- fromJSON(apiCall, simplifyDataFrame = TRUE)
  
  seRankingData &lt;- data.frame(x, seRankingData)
  
  colnames(seRankingData) &lt;- c("Keyword", "Search Volume")
  
  output &lt;- seRankingData
}

seRankingRun &lt;- reduce(lapply(keywords$keyword, seRankingKeywords, "180", seRankingAPI), 
                       bind_rows)

# SEMRush

semRushAPI &lt;- "XXXXXXXXXX"

semRushKeywordData &lt;- function(x, y){
  
  apiCall &lt;- paste("https://api.semrush.com/?type=phrase_this&amp;key=", semRushAPI, 
                   "&amp;phrase=", x, "&amp;export_columns=Ph,Nq,Cp,Co,Nr,Td,In&amp;database=", 
                   y, sep = "")
  
  apiCall &lt;- gsub(" ", "%20", apiCall)
  
  semRushData &lt;- read.csv(apiCall, header = TRUE, sep = ";", stringsAsFactors = FALSE)

}

semRushOutput &lt;- do.call(rbind, lapply(keywords$keyword, semRushKeywordData, "uk"))
</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 5: Common Excel Formulas In R</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-5-common-excel-formulas-in-r/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Mon, 26 Feb 2024 21:18:59 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">http://167.71.131.91/?p=3248</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-5-common-excel-formulas-in-r/">R For SEO Part 5: Common Excel Formulas In R</a></p>
<p>Welcome back. It’s part 5 of my R for SEO series and I hope you’re all finding it useful so far. Up...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-5-common-excel-formulas-in-r/">R For SEO Part 5: Common Excel Formulas In R</a></p>
<p>Welcome back. It’s part 5 of my <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a> series and I hope you’re all finding it useful so far. Up to now, we’ve covered <a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">the basics</a>, <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">using packages and Google Analytics &amp; Search Console</a>, <a href="https://www.ben-johnston.co.uk/r-for-seo-part-3-data-visualisation-with-ggplot2-wordcloud/">data visualisation with GGPlot2 and wordcloud</a> and in our last piece, we looked at <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/">R functions for SEO</a>. Now let’s start seeing how the power of R can help us replicate the common Excel formulas we use in our day-to-day work, but faster and on larger datasets.</p>



<p>We’re going to use some of the datasets we’ve already created over the course of this series today for examples, but my goal is to make these commands and functions at least mostly reproducible based on what we’ve already learned to this point.</p>



<p>As always, if you have questions, please feel free to hit me up on Twitter or drop me a line through the contact form, and I hope you’ll consider signing up to my <strong>free</strong> mailing list. No spam, no sales pitches, just emails when I release new content.</p>






<h2 class="wp-block-heading">The Ifelse Statement In R</h2>



<p>The <a href="https://www.w3schools.com/r/r_if_else.asp" target="_blank">If statement</a> is one of the most common constructs used in programming and as SEOs, we use it a lot in Excel when we’re trying to find data that matches our specific criteria or does <em>not</em> match our criteria. Here’s how we can run a similar if statement in R:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gscData$Fifty.Or.More.Impressions &lt;- ifelse(gscData$Impressions >=50, "YES", "NO")</pre>



<p>Here, we’re looking at our Google Search Console query dataset that we created a couple of weeks ago (or you can just download your own data from Search Console and read it in using read.csv) and adding another column to it to say if a query has fifty or more impressions.</p>



<p>Again, a fairly simplistic usage, but hopefully it gives you an idea of how the command works.</p>



<p>It’ll give the following output by saying “YES” or “NO” against queries with fifty impressions or more. Obviously, your data will vary.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="841" height="185" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-ifelse-output.png" alt="R ifelse output" class="wp-image-3250" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-ifelse-output.png 841w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-ifelse-output-300x66.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-ifelse-output-150x33.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-ifelse-output-768x169.png 768w" sizes="(max-width: 841px) 100vw, 841px" /></figure>



<p>Let’s break it down.</p>



<h3 class="wp-block-heading">The Anatomy Of An Ifelse Command In R</h3>



<p>Breaking down our R ifelse command, we can see it works in the following way:</p>



<ul class="wp-block-list">
<li><strong>gscData$Fifty.Or.More.Impressions &lt;-:</strong> We’re adding the snappily-named Fifty.Or.More.Impressions column to our gscData object</li>



<li><strong>ifelse(gscData$Impressions &gt;=50: </strong>This calls the <a href="https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/ifelse" target="_blank">ifelse function</a> from base R and tells it the condition for the function is that our gscData Impressions column is greater than or equal to 50</li>



<li><strong>&#8220;YES&#8221;, &#8220;NO&#8221;): </strong>Much like its Excel counterpart, IF, our ifelse function needs responses for if our condition is or is not met. To keep this one simple, we’re just saying “YES” if there are 50 or more impressions and “NO” if not</li>
</ul>
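


<p>If you don’t have your Search Console data to hand, here’s a small self-contained sketch of the same command. The gscDemo dataframe and its numbers are made up purely for illustration.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up Search Console data for illustration
gscDemo &lt;- data.frame(Query = c("tv units", "bookcases", "dining tables"), 
                      Impressions = c(120, 50, 12), 
                      stringsAsFactors = FALSE)

# ifelse checks every row at once and fills in the new column
gscDemo$Fifty.Or.More.Impressions &lt;- ifelse(gscDemo$Impressions &gt;= 50, "YES", "NO")

print(gscDemo)</pre>



<p>The row with exactly 50 impressions gets a “YES” because the condition is greater than <em>or equal to</em> 50.</p>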



<p>Now we’ve got the basics of the ifelse down, let’s use a slightly longer version which we can add more conditions to and make it a little more complicated.</p>



<h2 class="wp-block-heading">If Else In R With Multiple Conditions</h2>



<p>While we can use ifelse with multiple criteria, there’s a whole world of if statements that are possible in R if we go into slightly more complex statements.</p>



<p><a href="https://www.geeksforgeeks.org/r-if-statement/" target="_blank">If</a> and <a href="https://www.w3schools.com/r/r_if_else.asp" target="_blank">else</a> can be broken down into separate commands, all with their own statements attached, making it incredibly flexible. Almost like replicating Excel’s IFS function in R.</p>



<p>Let’s create another fairly simple one using the same Google Search Console dataset and, as we saw in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/">part 4</a>, turn it into a flexible and reproducible function.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gscIFELSEFun &lt;- function(x, y){
  
  if(x >= y){
    return("Greater or Equal")
  }else{
    return("Less")
  }
  
}</pre>



<p>Now to run it, paste the following command into your console:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gscData$Function &lt;- sapply(gscData$Impressions, gscIFELSEFun, 50)</pre>



<p>As you can see, it’s slightly more complex, but runs very fast and, if you’ve got a decently-sized dataset, will take much less time than doing it in Excel.</p>



<h3 class="wp-block-heading">How Our If Else R Function Works</h3>



<p>As always, let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>gscIFELSEFun &lt;- function(x, y){:</strong> Another <em>very</em> snappy name for today. This function is called gscIFELSEFun and uses x and y variables</li>



<li><strong>if(x &gt;= y){: </strong>We’re calling the if function and using our variables for the condition, with the output going inside the braces. In this case, x is a value from our dataset and y is the value we’re comparing it against. For the purposes of this piece, we want to see if our data is greater than or equal to our y condition</li>



<li><strong>return(&#8220;Greater or Equal&#8221;)}: </strong>Should the condition from our if statement (x being greater or equal to y) be met, it should return the specified value – “Greater or Equal” in this case</li>



<li><strong>else{return(&#8220;Less&#8221;)}:</strong> Now to the else phase of our function. If the conditions of our statement are not met, we want it to return the value “Less”</li>
</ul>
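


<p>One thing worth spelling out is why we ran the function with sapply. A plain if only checks one value at a time, so sapply feeds it each row in turn. Here’s a quick sketch with made-up impressions figures showing that it lands on the same labels as a vectorised ifelse.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gscIFELSEFun &lt;- function(x, y){
  
  if(x &gt;= y){
    return("Greater or Equal")
  }else{
    return("Less")
  }
  
}

# Made-up impressions figures
impressions &lt;- c(120, 50, 12)

# if() checks one value at a time, so sapply passes each value in turn
viaSapply &lt;- sapply(impressions, gscIFELSEFun, 50)

# The vectorised ifelse from earlier gives the same labels in one call
viaIfelse &lt;- ifelse(impressions &gt;= 50, "Greater or Equal", "Less")

print(identical(viaSapply, viaIfelse))</pre>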



<p>Now this looks remarkably similar to our previous command, doesn’t it? So why have we added extra steps and turned it into a function?</p>



<p>For this particular example, it’s a learning exercise, obviously. But this is how we would go about building up increasingly complex if else statements in R, with multiple conditions and multiple outputs, truly replicating Excel’s IFS function or nested IF statements.</p>



<p>Now let’s use what we learned and replicate Excel’s IFS formula with a nested if else function in R.</p>



<h2 class="wp-block-heading">Nesting If Else Statements To Replicate Excel IFS Formula In R</h2>



<p>IFS is one of the main formulas I find myself using in Excel. It’s very handy for finding matches across multiple conditions, and we can easily replicate it in R by nesting our if else commands and wrapping them in a function to make them fast and reproducible.</p>



<p>Let’s take our above example and create a function that tells us if our Search Console impressions are greater than, equal to or less than 50.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gscNestedFun &lt;- function(x, y){
  
  if(x > y){
    return("Greater")
  }else{
    if(x &lt; y){
      return("Less")
    }else{
      if(x == y){
        return("Equal")
      }
    }
  }
}</pre>



<p>Wow, that’s a lot of closing braces, right? But it should be fairly self-explanatory if you’ve been following along.</p>



<p>Let’s break it down.</p>



<ul class="wp-block-list">
<li><strong>gscNestedFun &lt;- function(x, y){: </strong>As always, we’ve got the name of our function, our x and y variables and our opening braces to start our function</li>



<li><strong>if(x &gt; y){ return(&#8220;Greater&#8221;)}: </strong>Again, we’ve got our starting if statement. In this part, we’re looking to see if our Search Console impressions are greater than 50. If they are, the function returns “Greater”</li>



<li><strong>else{if(x &lt; y){return(&#8220;Less&#8221;)}: </strong>Our first else command says that if the previous condition isn’t met, to create another if statement to see if our impressions are less than 50. If they are, return “Less” in our output</li>



<li><strong>else{if(x == y){return(&#8220;Equal&#8221;)}:</strong> And for our final else-if commands, we’re seeing if our impressions are equal to 50 and returning “Equal” if they are. As mentioned in part 1, if we want exact matches, we need to double up on the equals symbol in R. Don’t forget all those closing braces!</li>
</ul>
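
<p>If the nesting gets hard to read, R also lets you chain the conditions with else if. This is a sketch of an equivalent function (gscElseIfFun is a hypothetical name used for illustration), where the final else catches the “Equal” case because it’s the only possibility left for ordinary numbers:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gscElseIfFun &lt;- function(x, y){
  if(x > y){
    "Greater"
  }else if(x &lt; y){
    "Less"
  }else{
    "Equal"
  }
}

# Toy values, not real Search Console data
elseIfLabels &lt;- sapply(c(10, 50, 120), gscElseIfFun, 50)</pre>

<p>Like the nested version, this expects non-missing numbers, as if can’t test an NA.</p>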



<p>To run it, as before, we need to use sapply:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gscData$NestedFunction &lt;- sapply(gscData$Impressions, gscNestedFun, 50)</pre>



<p>And if we look at a table of our output, you’ll see something like this (although your numbers will be different).</p>



<figure class="wp-block-image size-full"><img decoding="async" width="323" height="85" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/ifelse-r-table.png" alt="A table of R results, from an ifelse function" class="wp-image-3251" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/ifelse-r-table.png 323w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/ifelse-r-table-300x79.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/ifelse-r-table-150x39.png 150w" sizes="(max-width: 323px) 100vw, 323px" /></figure>



<p>That’s a good primer on using the if, else and ifelse commands and how we can use them to replicate a couple of common Excel formulas in R. Most of programming comes down to if and else to varying degrees, so there’s a lot that we can do here.</p>



<h2 class="wp-block-heading">Excel Countif In R</h2>



<p>When I use Excel, particularly on larger datasets, I find myself using <a href="https://support.microsoft.com/en-gb/office/countif-function-e0de10c6-f885-4e71-abb4-1f464816df34" target="_blank">countif </a>quite a lot. Obviously, the whole point of using R for SEO is to work with larger datasets quickly and efficiently, so the countif is definitely something that should be in your arsenal.</p>



<p>Fortunately, this is very easy to do. While you can use the basic if/ else statement above and work through it accordingly, you can actually do this all in one line using some base R syntax.</p>



<p>Again, let’s run it on our Google Search Console dataset to see how many queries have 50 impressions or more:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sum(gscData$Impressions >= 50, na.rm=TRUE) </pre>



<p>And if we run this command, we’ll get the following output (again, your data will be different):</p>



<figure class="wp-block-image size-full"><img decoding="async" width="458" height="40" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result.png" alt="Excel countif in R" class="wp-image-3253" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result.png 458w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result-300x26.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result-150x13.png 150w" sizes="(max-width: 458px) 100vw, 458px" /></figure>



<p>How does this work? Like so:</p>



<ul class="wp-block-list">
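<li><strong>sum(:</strong> The sum command adds up whatever we give it. Here we’re giving it a TRUE or FALSE for each row, and because R counts TRUE as 1 and FALSE as 0, the total is the number of rows that meet our chosen criteria</li>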



<li><strong>gscData$Impressions:</strong> Our dataset that we’re running the operation on (if this were a function, it’d be x)</li>



<li><strong>&gt;=50,:</strong> The criteria for our sum (the “if” if you will). Here I’ve used &gt;=50 (greater than or equal to 50) for an example, but it can be pretty much whatever you want it to be</li>



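<li><strong>na.rm=TRUE):</strong> This tells sum to ignore any missing (NA) values. Without it, a single NA in the Impressions column would make the whole result NA. Rows that don’t meet our criteria are already excluded, because they evaluate to FALSE, which counts as zero</li>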
</ul>
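
<p>The reason this works is that R treats TRUE as 1 and FALSE as 0 when you add them up. A quick sketch with made-up numbers (including one missing value) shows each stage:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy impressions with a missing value (made-up numbers)
impressions &lt;- c(10, 50, NA, 120)

# The comparison gives one TRUE/FALSE/NA per row
meetsCriteria &lt;- impressions >= 50

# Without na.rm, the NA carries through to the total
countWithNA &lt;- sum(meetsCriteria)

# With na.rm = TRUE, the NA is dropped and the TRUEs are counted
countIf &lt;- sum(meetsCriteria, na.rm = TRUE)</pre>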



<p>Nice and simple, right? That’s how we can replicate Excel’s countif formula in R.</p>



<p>Now let’s look at sumif.</p>



<h2 class="wp-block-heading">Excel Sumif In R</h2>



<p><a href="https://support.microsoft.com/en-gb/office/sumif-function-169b8c99-c05c-4483-a712-1697a653039b" target="_blank">Excel’s Sumif </a>– adding together the numbers that match a given criterion – is similarly simple in R and, again, if you’ve got quite a lot of data, a lot faster than running it in Excel. Here’s how we can do that.</p>



<p>As above, we’re going to focus on our Search Console dataset.</p>



<p>This is another very simple Excel formula that we can replicate with base R. We can run a Sumif like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sum(gscData$Impressions[gscData$Impressions >= 50])</pre>



<p>If you run that in your console, you’ll see something like the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="458" height="40" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result-1.png" alt="R sumif result" class="wp-image-3254" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result-1.png 458w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result-1-300x26.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/r-countif-result-1-150x13.png 150w" sizes="(max-width: 458px) 100vw, 458px" /></figure>



<p>Shall we see how it works?</p>



<h3 class="wp-block-heading">How Sumif Replication In R Works</h3>



<p>As always, let’s break it down.</p>



<ul class="wp-block-list">
<li><strong>sum(gscData$Impressions: </strong>Like we did before, we’re running the sum command on our gscData Impressions column, but there’s one very important difference that makes this a sumif rather than a countif</li>



<li><strong>[gscData$Impressions &gt;= 50]):</strong> As an added condition to our previous sum() command, we’re creating a subset of all impressions that are greater than or equal to 50 and feeding back that these figures should be added together rather than just counted</li>
</ul>
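
<p>One thing to watch: unlike our countif, this sumif has no na.rm argument, so a missing value in the column will turn the total into NA. A minimal sketch with made-up numbers, showing the subset and one way to guard against NAs:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy impressions with a missing value (made-up numbers)
impressions &lt;- c(10, 50, NA, 120)

# The square brackets keep only the rows where the test is TRUE (NAs come along too)
matched &lt;- impressions[impressions >= 50]

# Summing the subset as-is returns NA because of the missing value
sumWithNA &lt;- sum(matched)

# Adding na.rm = TRUE ignores the NA and adds up the rest
sumIf &lt;- sum(matched, na.rm = TRUE)</pre>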



<p>Very simple, isn’t it? And much faster than Excel. I’m hoping by this point, you’ll be seeing the advantages of using R for much of your SEO analysis work rather than relying on spreadsheets.</p>



<h2 class="wp-block-heading">Index Match Or Vlookup In R</h2>



<p><a href="https://support.microsoft.com/en-gb/office/look-up-values-with-vlookup-index-or-match-68297403-7c3c-4150-9e3c-4d348188976b" target="_blank">Index match in Excel</a> (or Vlookup, if you’re a dirty heathen) is a great way of matching datasets on a common value. Honestly, it’s something I find I have to do quite a lot due to working across ranges of datasets, but fortunately R makes it very easy and quick.</p>



<p>There are two key ways I’m going to show you how to do this today – one using the <a href="https://dplyr.tidyverse.org/reference/mutate-joins.html" target="_blank">left_join function</a> from Dplyr in the Tidyverse package (the quickest and most efficient way), and also a function to let you do it without the Tidyverse if you can’t use it for whatever reason.</p>



<p>Firstly, let’s look at emulating an index match using left_join.</p>



<h3 class="wp-block-heading">Preparing Our Dataset For Index Match/ Vlookup Emulation With R</h3>



<p>Here, we’re going to return to our TV Units dataset from <a href="https://seranking.com/?ga=2640572&amp;source=link" target="_blank">SE Ranking</a> that we used in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/">part 4</a> and create a separate frame for our domains, using the domainNames function we created there. To save you clicking around into the previous article, you’ll find the dataset here, and the function is as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">domainNames &lt;- function(x){

  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
  
}</pre>



<p>If you’ve been following along, it shouldn’t be too scary to import that dataset using read.csv as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tvUnits &lt;- read.csv("tvUnits.csv", stringsAsFactors = FALSE)</pre>



<p>And to create our secondary datasets, we run the following commands:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tvUnits2 &lt;- data.frame(tvUnits$URL)</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">names(tvUnits2) &lt;- names(tvUnits["URL"])
tvUnits2$Domain &lt;- sapply(tvUnits2$URL, domainNames) # Add the Domain column we'll look up later</pre>



<p>That’s our datasets ready to go. Now let’s get back to it.</p>
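
<p>As a quick check of what the domainNames function does, here’s a sketch on a couple of made-up URLs (example.com and example.org are placeholders, not rows from the TV Units dataset). The function handles one URL at a time, which is why we run it through sapply:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">domainNames &lt;- function(x){
  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
}

# Placeholder URLs for illustration only
urls &lt;- c("https://www.example.com/tv-units/oak", "http://example.org/")

# unname() drops the URL labels that sapply adds to character input
domains &lt;- unname(sapply(urls, domainNames))</pre>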



<h3 class="wp-block-heading">Index Match Or Vlookup In R With Left_Join</h3>



<p>Running an index match/ vlookup emulation in R is a really quick and easy command, no matter how large your dataset is. It’s certainly faster than doing it in Excel – especially if you use vlookup.</p>



<p>You may have gathered that I look down upon vlookups. You would be correct in that assumption.</p>



<p>Anyway, here’s what a left_join-based index match/ vlookup command would look like in R, using the datasets that we just created. Remember that you will need dplyr or the <a href="https://www.tidyverse.org/" target="_blank">Tidyverse packages</a> installed to make this work.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tvUnits &lt;- tvUnits %>% 
  left_join(tvUnits2, by = "URL")</pre>



<p>As you can see, the two datasets have merged using the URL as the anchor, which is why it’s important to make sure there’s a consistent anchor value.</p>



<p>Let’s take a look at how this works:</p>



<ul class="wp-block-list">
<li><strong>tvUnits &lt;- tvUnits %&gt;%:</strong> We’re overwriting our tvUnits dataset with the joined result, using the Tidyverse’s pipe (“chain”) command to pass tvUnits into the next function. The join adds the “Domain” column for us, so we assign to the whole frame rather than to a single column</li>



<li><strong>left_join(tvUnits2, by = &#8220;URL&#8221;):</strong> The dplyr/ Tidyverse left_join command is how we replicate the Excel index match or vlookup function – it keeps every row from our left-hand dataframe (tvUnits) and adds the columns from the right-hand one (tvUnits2) wherever our chosen column (“URL” in this case) matches</li>
</ul>
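
<p>Here’s a small self-contained sketch of the same kind of join on made-up data, assuming dplyr is installed. The frames and values are placeholders; the point is that every row from the left-hand frame is kept, and any URL with no match in the lookup frame gets NA, much like an #N/A from a failed vlookup:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(dplyr)

# Left-hand frame: the data we want to add a column to (made-up values)
rankings &lt;- data.frame(
  URL = c("https://a.example/x", "https://b.example/y", "https://c.example/z"),
  Position = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Lookup frame: one row per URL, with the value we want to pull across
lookup &lt;- data.frame(
  URL = c("https://a.example/x", "https://b.example/y"),
  Domain = c("a.example", "b.example"),
  stringsAsFactors = FALSE
)

# Assign the joined result to a frame, not to a single column
joined &lt;- rankings %>%
  left_join(lookup, by = "URL")</pre>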



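<p>Left_join is by far the quickest and easiest way to emulate Excel’s index match or vlookup function. However, it can sometimes cause issues if your two dataframes don’t line up neatly: for example, if a URL appears more than once in the lookup frame, the joined output will contain a row for each match.</p>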



<h3 class="wp-block-heading">An Index Match Function In R Without Dplyr</h3>



<p>Sometimes, there might be occasions where you can’t use a certain R package, such as the Tidyverse. You may have a project that’s dependent on a package that conflicts with it, your IT department might block the installation of packages (I’ve been there and it’s <em>infuriating</em>), there are lots of possible reasons, so it’s worth learning ways around the problem.</p>



<p>In the case of replicating Excel’s index match or vlookup without Dplyr (meaning we can’t use left_join), here’s a simple function that you can use. In some cases, when you only want a specific column from a dataset, this can actually be a little smoother than left_join.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">indexMatch &lt;- function(x, y, z){
  
  row &lt;- match(x, y)
  
  targetVal &lt;- z[row]
  
}</pre>



<p>To run this function, you need to have the following parameters to hand:</p>



<ul class="wp-block-list">
<li><strong>x: </strong>The values we’re looking up</li>



<li><strong>y:</strong> The range we’re searching for those values</li>



<li><strong>z:</strong> Our target data range, which we want values returned from</li>
</ul>



<p>Now that we’ve got those, let’s run our index match R function.</p>



<p>We can do this like so, using our two tvUnits dataframes.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tvUnits$Domain2 &lt;- indexMatch(tvUnits$URL, tvUnits2$URL, tvUnits2$Domain)</pre>



<p>Now you’ll see a new column in your dataset called “Domain2”, which, as before, matches the domain from our tvUnits2 frame against the URL in the tvUnits frame, similar to the left_join command.</p>



<h3 class="wp-block-heading">How The IndexMatch Function Works</h3>



<p>Let’s break the function down:</p>



<ul class="wp-block-list">
<li><strong>indexMatch &lt;- function(x, y, z){:</strong> Our function is called indexMatch and we’ve got x, y and z variables as we covered above</li>



<li><strong>row &lt;- match(x, y):</strong> We’re creating an object called row, using base R’s <a href="https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/match" target="_blank">match function</a> to find the row number in the range we’re searching (y) where each of our lookup values (x) appears, in the same way a vlookup or index match works in Excel</li>



<li><strong>targetVal &lt;- z[row]}:</strong> Now we know which row we’re targeting based on our match function, we get our target value by extracting the entry at that row number from the target range (z). As this is the last line of the function, it’s also the value that gets returned</li>
</ul>
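
<p>To make the match step concrete, here’s a sketch with made-up vectors (all three names are placeholders). match returns the position of each lookup value in the range being searched, or NA where there’s no match, and those positions are then used to pull values out of the target range:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lookupValues &lt;- c("b", "c", "z")   # what we're looking up (x)
searchRange  &lt;- c("a", "b", "c")   # where we're looking (y)
targetRange  &lt;- c(10, 20, 30)      # what we want back (z)

# Position of each lookup value in the search range
rows &lt;- match(lookupValues, searchRange)

# Values from the target range at those positions
found &lt;- targetRange[rows]</pre>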



<p>And there we go. Excel’s index match/ vlookup formulas emulated in R, using the Dplyr package from the Tidyverse and also with a function using base R.</p>



<h2 class="wp-block-heading">Pivot Tables In R</h2>



<p><a href="https://support.microsoft.com/en-gb/office/create-a-pivottable-to-analyze-worksheet-data-a9a84538-bfe9-40a9-a8e9-f99134456576" target="_blank">Pivot tables </a>are something many SEOs use a lot in Excel, and it’s easy to see why. They’re a brilliant way to present and group large amounts of data in an easy-to-digest format.</p>



<p>But again, the problems of <em>too much</em> data can cause serious issues with Excel, which is why we’re using R in the first place, right?</p>



<p>So here’s how you can create super-useful pivot tables in R using the <a href="http://www.pivottabler.org.uk/" target="_blank">pivottabler package</a> and export them in a number of different client-friendly formats, rather than sending a 200MB Excel file that will kill their computer.</p>



<p>First, we need to install the package. As usual, it’s the common commands:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">install.packages("pivottabler")

library(pivottabler)</pre>



<p>Now we’ve got to get our data ready for a pivot table, which means we need a consistent column to group by. For this example, we’re going to use a Google Analytics export from GA4 with landing page and channel as our dimensions.</p>



<p>If you’ve followed the series thus far, this will be fairly simple. You can refresh your knowledge with <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">Part 2</a>, or you can just export from the GA4 site.</p>



<p>Now we have our data, we need to create our pivot table.</p>



<h3 class="wp-block-heading">Our First Pivot Table With Pivottabler</h3>



<p>We’re going to create our first pivot table using the landing page as the key and the session channel as the second dimension in the summary.</p>



<p>Here’s how to create a very simple one and then we’ll start expanding later.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pt1 &lt;- PivotTable$new()
pt1$addData(gaData)
pt1$defineCalculation(calculationName = "Total Sessions", summariseExpression = "sum(Sessions)")
pt1$addRowDataGroups("Landing.page")
pt1$addRowDataGroups("Session.default.channel.group")
pt1$renderPivot()</pre>



<p>This command will create the following output in your RStudio viewer window:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="557" height="505" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205725.png" alt="First pivot table created with R" class="wp-image-3258" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205725.png 557w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205725-300x272.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205725-150x136.png 150w" sizes="(max-width: 557px) 100vw, 557px" /></figure>



<p>As you can see, it gives us a simple html output, which looks quite pretty and can be exported as an image or HTML like so.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="609" height="274" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205834.png" alt="How to export an R pivot table to HTML from RStudio" class="wp-image-3259" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205834.png 609w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205834-300x135.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-205834-150x67.png 150w" sizes="(max-width: 609px) 100vw, 609px" /></figure>



<p>Now, let’s have a look at how our R pivot table works.</p>



<h3 class="wp-block-heading">How The R Pivot Table Works</h3>



<p>Here’s that phrase again: let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>pt1 &lt;- PivotTable$new(): </strong>We’re creating an object called pt1 by calling pivottabler’s PivotTable$new() command, which gives us a new, empty pivot table to build on</li>



<li><strong>pt1$addData(gaData): </strong>We’re invoking our dataset with our pivot table. The gaData frame in this case</li>



<li><strong>pt1$defineCalculation(calculationName = &#8220;Total Sessions&#8221;, summariseExpression = &#8220;sum(Sessions)&#8221;):</strong> This tells R that the calculation we want to use for our pivot table is called “Total Sessions” and we’re using the sum calculation on “Sessions” from our dataset</li>



<li><strong>pt1$addRowDataGroups(&#8220;Landing.page&#8221;): </strong>The first row grouping of our pivot table, the “key”, is our landing page</li>



<li><strong>pt1$addRowDataGroups(&#8220;Session.default.channel.group&#8221;):</strong> Our second row grouping, our drilldown dimension nested under each landing page, is our traffic channel</li>



<li><strong>pt1$renderPivot(): </strong>Now we’re using pivottabler’s renderPivot command to turn this to HTML and render it in our viewer window</li>
</ul>
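
<p>If you want to sanity-check the totals in the pivot table, base R’s aggregate function can produce the same kind of grouped sums as a plain data frame. This is a sketch on made-up data that borrows the column names from our GA4 export; it’s a cross-check, not part of pivottabler:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up rows using the same column names as our gaData frame
gaToy &lt;- data.frame(
  Landing.page = c("/", "/", "/blog/", "/blog/"),
  Session.default.channel.group = c("Organic Search", "Organic Search", "Direct", "Organic Search"),
  Sessions = c(10, 5, 3, 7),
  stringsAsFactors = FALSE
)

# Sum Sessions for every landing page and channel combination
sessionTotals &lt;- aggregate(
  Sessions ~ Landing.page + Session.default.channel.group,
  data = gaToy,
  FUN = sum
)</pre>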



<h3 class="wp-block-heading">Adding Extra Columns To Our R Pivot Table</h3>



<p>If you export directly from GA4, or you add extra metrics to your Google Analytics call (I promise I’ll update part 2 to use the GA4 API soon), you’ll find that you have metrics other than sessions, such as Users and Engagement Time.</p>



<p>Let’s add those to our pivot table as well, summing Users and calculating the Average Engagement Time, so we can see how many users our sessions are driving and how long they’re spending with our content from each channel.</p>



<p>We can do that like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pt2 &lt;- PivotTable$new()
pt2$addData(gaData)
pt2$defineCalculation(calculationName = "Total Sessions", summariseExpression = "sum(Sessions)")
pt2$defineCalculation(calculationName = "Total Users", summariseExpression = "sum(Users)")
pt2$defineCalculation(calculationName = "Average Engagement Time", summariseExpression = "round(mean(Average.engagement.time.per.session), 2)")
pt2$addRowDataGroups("Landing.page")
pt2$addRowDataGroups("Session.default.channel.group")
pt2$renderPivot()
</pre>



<p>And that will give us the following output:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="561" height="505" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-210321.png" alt="Updated R pivot table" class="wp-image-3260" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-210321.png 561w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-210321-300x270.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/02/Screenshot-2024-02-26-210321-150x135.png 150w" sizes="(max-width: 561px) 100vw, 561px" /></figure>



<h3 class="wp-block-heading">How This Expanded Pivot Table Works</h3>



<p>I won’t re-use all the explanations from the previous section, but let’s take a look at the extra rows we’ve added. For clarity’s sake in the code, I’ve called this one “pt2” rather than “pt1”.</p>



<ul class="wp-block-list">
<li><strong>pt2$defineCalculation(calculationName = &#8220;Total Users&#8221;, summariseExpression = &#8220;sum(Users)&#8221;):</strong> We’ve added an extra column to our pivot table called “Total Users”, which is adding up “Users” from our gaData dataset</li>



<li><strong>pt2$defineCalculation(calculationName = &#8220;Average Engagement Time&#8221;, summariseExpression = &#8220;round(mean(Average.engagement.time.per.session), 2)&#8221;):</strong> This line might look a little intimidating, but all it’s doing is calculating the Average Engagement Time, and then using the round function to take it to two decimal places, which is why it’s wrapped in brackets</li>
</ul>
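
<p>One caveat worth knowing: mean treats every row equally, so a landing page with a handful of sessions counts as much as one with thousands. If that matters for your analysis, a session-weighted average is an option. A minimal sketch with made-up numbers, using base R’s weighted.mean:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up engagement times (seconds) and the sessions behind each one
engagementTime &lt;- c(30, 90)
sessions &lt;- c(100, 10)

# Simple mean, as used in the pivot table calculation
simpleAverage &lt;- round(mean(engagementTime), 2)

# Weighted by sessions, so the busier row has more influence
weightedAverage &lt;- round(weighted.mean(engagementTime, sessions), 2)</pre>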



<p>So there we go. Hopefully that gives you a solid basis to create pivot tables in R.</p>



<p>Let’s look at styling them up so that you can include them in presentations and share with your clients.</p>



<h3 class="wp-block-heading">Adding Styling To Pivot Tables With R</h3>



<p>Finally, let’s create another pivot table using our R code from earlier, but naming it pt3 instead of pt2. From here, we’ll add another few lines to centre-align some cells and colour their text in dark green (the accent colour from my site).</p>



<p>Here’s how to do that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pt3 &lt;- PivotTable$new()
pt3$addData(gaData)
pt3$defineCalculation(calculationName = "Total Sessions", summariseExpression = "sum(Sessions)")
pt3$defineCalculation(calculationName = "Total Users", summariseExpression = "sum(Users)")
pt3$defineCalculation(calculationName = "Average Engagement Time", summariseExpression = "round(mean(Average.engagement.time.per.session), 2)")
pt3$addRowDataGroups("Landing.page")
pt3$addRowDataGroups("Session.default.channel.group")
pt3$evaluatePivot()

### Add Styling

pt3$setStyling(
  rowNumbers = c(1),  # Target the header row
  columnNumbers = c(4, 5, 6),  # Target the specific columns
  declarations = list("text-align" = "center", "color" = "#008285")  # CSS uses the US spelling "center"
) 
pt3$renderPivot()
</pre>



<p>Let’s take a look at the extra rows and how the styling works.</p>



<h3 class="wp-block-heading">How Styling In Pivottabler Works</h3>



<p>For the last time today, let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>pt3$setStyling(: </strong>We’re telling R to <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/"   title="apply" data-wpil-keyword-link="linked"  data-wpil-monitor-id="229">apply</a> styling rules</li>



<li><strong>rowNumbers = c(1): </strong>We’re looking at the first row, the header row containing the column names</li>



<li><strong>columnNumbers = c(4, 5, 6)</strong>: These are the columns for &#8220;Total Users&#8221;, &#8220;Average Engagement Time&#8221;, and &#8220;Total Sessions&#8221;</li>



<li><strong>declarations = list(&#8220;text-align&#8221; = &#8220;center&#8221;):</strong> Here, we’re setting the above columns to be centre aligned</li>



<li><strong>&#8220;color&#8221; = &#8220;#008285&#8221;)):</strong> Finally, we’re setting the text colour to the same colour as my site’s accent colour – change as appropriate</li>
</ul>



<p>And there we go, a whistlestop tour of using the SEO’s old favourite &#8211; pivot tables &#8211; in R with Google Analytics data and adding some styling to them. Try it yourself, there’s a lot you can do with them.</p>



<h2 class="wp-block-heading">Wrapping Up</h2>



<p>That’s a few common Excel formulas replicated with R, plus how to create pivot tables and style them up according to your branding.</p>



<p>I hope you’ll join me next time, where we’ll take a look at using APIs in R. I’m really excited about that piece, and I hope you’ll enjoy reading it and using it as much as I enjoyed writing it.</p>



<p>As always, our R script is below.</p>



<h3 class="wp-block-heading">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Load Packages

library(tidyverse)

library(rvest)

library(pivottabler)

# Read Search Console Queries

gscData &lt;- read.csv("Queries.csv", stringsAsFactors = FALSE)

# If Statement for >= 50 Impressions

gscData$Fifty.Or.More.Impressions &lt;- ifelse(gscData$Impressions >=50, "YES", 
                                            "NO")

# If Else Statement With Multiple Conditions

# return() only works inside a function, so we use the bare values here
gscIFELSE1 &lt;- if(gscData$Impressions[1] >=50){
  "Greater or Equal"
}else{
  "Less"
}

## If Else Statement Function

gscIFELSEFun &lt;- function(x, y){
  
  if(x >= y){
    return("Greater or Equal")
  }else{
    return("Less")
  }
  
}

gscData$Function &lt;- sapply(gscData$Impressions, gscIFELSEFun, 50)

## Nested If Else Statement Function

gscNestedFun &lt;- function(x, y){
  
  if(x > y){
    return("Greater")
  }else{
    if(x &lt; y){
      return("Less")
    }else{
      if(x == y){
        return("Equal")
      }
    }
  }
}

gscData$NestedFunction &lt;- sapply(gscData$Impressions, gscNestedFun, 50)

# Excel Countif In R

sum(gscData$Impressions >= 50, na.rm=TRUE)

# Excel Sumif In R

sum(gscData$Impressions[gscData$Impressions >= 50])

# Excel Index Match In R

## TV Unit Data &amp; Domain Function From Part 4

tvUnits &lt;- read.csv("tvUnits.csv", stringsAsFactors = FALSE)

domainNames &lt;- function(x){
  
  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
  
}

tvUnits2 &lt;- data.frame(tvUnits$URL)

names(tvUnits2) &lt;- names(tvUnits["URL"])

tvUnits2$Domain &lt;- sapply(tvUnits2$URL, domainNames)

## Index Match Emulation With Left_Join

tvUnits &lt;- tvUnits %>% 
  left_join(tvUnits2, by = "URL")

## Index Match Emulation Function Without Dplyr

indexMatch &lt;- function(x, y, z){
  
  row &lt;- match(x, y)
  
  targetVal &lt;- z[row]
  
}

tvUnits$Domain2 &lt;- indexMatch(tvUnits$URL, tvUnits2$URL, tvUnits2$Domain)

# Pivot Tables In R

gaData &lt;- read.csv("gaData.csv", stringsAsFactors = FALSE)

## First Pivot Table

pt1 &lt;- PivotTable$new()
pt1$addData(gaData)
pt1$defineCalculation(calculationName = "Total Sessions", summariseExpression = "sum(Sessions)")
pt1$addRowDataGroups("Landing.page")
pt1$addRowDataGroups("Session.default.channel.group")
pt1$renderPivot()

## Second Pivot Table With Extra Metrics 

pt2 &lt;- PivotTable$new()
pt2$addData(gaData)
pt2$defineCalculation(calculationName = "Total Sessions", summariseExpression = "sum(Sessions)")
pt2$defineCalculation(calculationName = "Total Users", summariseExpression = "sum(Users)")
pt2$defineCalculation(calculationName = "Average Engagement Time", summariseExpression = "round(mean(Average.engagement.time.per.session), 2)")
pt2$addRowDataGroups("Landing.page")
pt2$addRowDataGroups("Session.default.channel.group")
pt2$renderPivot()

## Styling Pivot Table

pt3 &lt;- PivotTable$new()
pt3$addData(gaData)
pt3$defineCalculation(calculationName = "Total Sessions", summariseExpression = "sum(Sessions)")
pt3$defineCalculation(calculationName = "Total Users", summariseExpression = "sum(Users)")
pt3$defineCalculation(calculationName = "Average Engagement Time", summariseExpression = "round(mean(Average.engagement.time.per.session), 2)")
pt3$addRowDataGroups("Landing.page")
pt3$addRowDataGroups("Session.default.channel.group")
pt3$evaluatePivot()
###Add Styling
pt3$setStyling(
  rowNumbers = c(1),  # Target the header row
  columnNumbers = c(4, 5, 6),  # Target the specific columns
  declarations = list("text-align" = "center", "color" = "#008285")
) 
pt3$renderPivot()</pre>
<style>
.lwrp.link-whisper-related-posts{
            
            margin-top: 40px;
margin-bottom: 30px;
        }
        .lwrp .lwrp-title{
            
            
        }.lwrp .lwrp-description{
            
            

        }
        .lwrp .lwrp-list-container{
        }
        .lwrp .lwrp-list-multi-container{
            display: flex;
        }
        .lwrp .lwrp-list-double{
            width: 48%;
        }
        .lwrp .lwrp-list-triple{
            width: 32%;
        }
        .lwrp .lwrp-list-row-container{
            display: flex;
            justify-content: space-between;
        }
        .lwrp .lwrp-list-row-container .lwrp-list-item{
            width: calc(25% - 20px);
        }
        .lwrp .lwrp-list-item:not(.lwrp-no-posts-message-item){
            
            max-width: 150px;
        }
        .lwrp .lwrp-list-item img{
            max-width: 100%;
            height: auto;
            object-fit: cover;
            aspect-ratio: 1 / 1;
        }
        .lwrp .lwrp-list-item.lwrp-empty-list-item{
            background: initial !important;
        }
        .lwrp .lwrp-list-item .lwrp-list-link .lwrp-list-link-title-text,
        .lwrp .lwrp-list-item .lwrp-list-no-posts-message{
            
            
            
            
        }@media screen and (max-width: 480px) {
            .lwrp.link-whisper-related-posts{
                
                
            }
            .lwrp .lwrp-title{
                
                
            }.lwrp .lwrp-description{
                
                
            }
            .lwrp .lwrp-list-multi-container{
                flex-direction: column;
            }
            .lwrp .lwrp-list-multi-container ul.lwrp-list{
                margin-top: 0px;
                margin-bottom: 0px;
                padding-top: 0px;
                padding-bottom: 0px;
            }
            .lwrp .lwrp-list-double,
            .lwrp .lwrp-list-triple{
                width: 100%;
            }
            .lwrp .lwrp-list-row-container{
                justify-content: initial;
                flex-direction: column;
            }
            .lwrp .lwrp-list-row-container .lwrp-list-item{
                width: 100%;
            }
            .lwrp .lwrp-list-item:not(.lwrp-no-posts-message-item){
                
                max-width: initial;
            }
            .lwrp .lwrp-list-item .lwrp-list-link .lwrp-list-link-title-text,
            .lwrp .lwrp-list-item .lwrp-list-no-posts-message{
                
                
                
                
            };
        }</style>
<div id="link-whisper-related-posts-widget" class="link-whisper-related-posts lwrp">
            <h3 class="lwrp-title">Related Posts</h3>    
        <div class="lwrp-list-container">
                                            <div class="lwrp-list-multi-container">
                    <ul class="lwrp-list lwrp-list-double lwrp-list-left">
                        <li class="lwrp-list-item"><a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/" class="lwrp-list-link"><img width="480" height="230" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-for-seo-part-4.png" class="attachment-480x480 size-480x480 wp-post-image" alt="R for SEO part 4: functions" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-for-seo-part-4.png 951w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-for-seo-part-4-300x144.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-for-seo-part-4-150x72.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-for-seo-part-4-768x367.png 768w" sizes="(max-width: 480px) 100vw, 480px" /><br><span class="lwrp-list-link-title-text">R For SEO Part 4: Functions</span></a></li><li class="lwrp-list-item"><a href="https://www.ben-johnston.co.uk/r-for-seo-part-3-data-visualisation-with-ggplot2-wordcloud/" class="lwrp-list-link"><img width="480" height="230" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/r-for-seo-part-3-visualisation.png" class="attachment-480x480 size-480x480 wp-post-image" alt="" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/r-for-seo-part-3-visualisation.png 951w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/r-for-seo-part-3-visualisation-300x144.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/r-for-seo-part-3-visualisation-150x72.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/r-for-seo-part-3-visualisation-768x367.png 768w" sizes="(max-width: 480px) 100vw, 480px" /><br><span class="lwrp-list-link-title-text">R For SEO Part 3: Data Visualisation With GGPlot2 &#038; Wordcloud</span></a></li><li class="lwrp-list-item"><a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/" class="lwrp-list-link"><img width="480" height="230" 
src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/p2.png" class="attachment-480x480 size-480x480 wp-post-image" alt="R for SEO Part 2: Packages" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/p2.png 951w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/p2-300x144.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/p2-150x72.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/p2-768x367.png 768w" sizes="(max-width: 480px) 100vw, 480px" /><br><span class="lwrp-list-link-title-text">R For SEO Part 2: Packages, Google Analytics &#038; Search Console With R</span></a></li>                    </ul>
                    <ul class="lwrp-list lwrp-list-double lwrp-list-right">
                        <li class="lwrp-list-item"><a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/" class="lwrp-list-link"><img width="480" height="230" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/r-for-seo-p1.png" class="attachment-480x480 size-480x480 wp-post-image" alt="R For SEO Part One | Ben Johnston" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/r-for-seo-p1.png 951w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/r-for-seo-p1-300x144.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/r-for-seo-p1-150x72.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/r-for-seo-p1-768x367.png 768w" sizes="(max-width: 480px) 100vw, 480px" /><br><span class="lwrp-list-link-title-text">R For SEO Part 1: The Basics</span></a></li><li class="lwrp-list-item"><a href="https://www.ben-johnston.co.uk/keyword-topic-clustering-seo-r/" class="lwrp-list-link"><img width="360" height="480" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/20210210_002241-scaled.jpg" class="attachment-480x480 size-480x480 wp-post-image" alt="Keyword and Topic Clustering For SEO Using R" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/20210210_002241-scaled.jpg 1920w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/20210210_002241-225x300.jpg 225w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/20210210_002241-768x1024.jpg 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/20210210_002241-113x150.jpg 113w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/20210210_002241-1152x1536.jpg 1152w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/20210210_002241-1536x2048.jpg 1536w" sizes="(max-width: 360px) 100vw, 360px" /><br><span class="lwrp-list-link-title-text">Keyword &#038; Topic Clustering For SEO With R</span></a></li><li class="lwrp-list-item"><a 
href="https://www.ben-johnston.co.uk/sentiment-analysis-for-seo-using-google-sheets/" class="lwrp-list-link"><img width="480" height="230" src="https://www.ben-johnston.co.uk/wp-content/uploads/2020/06/sentiment-analysis-seo-google-sheets.png" class="attachment-480x480 size-480x480 wp-post-image" alt="sentiment analysis for SEO with Google Sheets" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2020/06/sentiment-analysis-seo-google-sheets.png 951w, https://www.ben-johnston.co.uk/wp-content/uploads/2020/06/sentiment-analysis-seo-google-sheets-300x144.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2020/06/sentiment-analysis-seo-google-sheets-150x72.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2020/06/sentiment-analysis-seo-google-sheets-768x367.png 768w" sizes="(max-width: 480px) 100vw, 480px" /><br><span class="lwrp-list-link-title-text">Sentiment Analysis For SEO Using Google Sheets</span></a></li>                    </ul>
                </div>
                        </div>
</div><p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 4: Functions</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Mon, 29 Jan 2024 20:24:46 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">http://167.71.131.91/?p=3203</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/">R For SEO Part 4: Functions</a></p>
<p>Welcome back to part four of my series on using R for SEO. We’re at the halfway point now and hopefully you’re...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/">R For SEO Part 4: Functions</a></p>
<p>Welcome back to part four of my series on using <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a>. We’re at the halfway point now and hopefully you’re starting to see the power that the R language can bring to your optimisation and analysis.</p>



<p>Today we’re going to start making it <em>feel</em> like we’re programming – we’re going to be writing our own R functions, looking at the anatomy of a function and creating a few of the most common ones that I use in my SEO work.</p>



<p>I know, again, that this piece is super late. The last eighteen months have been a <strong><em>lot</em></strong>, but hopefully the next few months will give me more time.</p>



<p>As always, shares of this piece are highly appreciated, and if you’d like updates on when I drop new content, please sign up for my FREE email list.</p>



<p>Ready to go?</p>


<h2 class="wp-block-heading">R Functions</h2>



<p>Functions in R or any other programming language are basically commands. Every command we’ve used up until this point has included a function of some description. Think of them like Excel formulae.</p>



<p>But the advantage of programming is that we’re not limited to the ones that come in the box – we can create our own. This is where using R starts to <em>feel</em> like it’s real programming rather than just running a series of commands in a terminal, and when I started using it, functions were where it all really started to click.</p>



<p>The beauty of a function is that its inputs are variable, so you can use it over and over again rather than writing the same commands constantly over the course of your analysis. You can also copy and paste functions from one .R file into another and they’ll still work.</p>



<p>An R function can be as simple or as complex as you need it to be. You can make an entire piece of analysis run from one function if you need it to, but there are a few core elements that you should always be aware of.</p>



<h3 class="wp-block-heading">Our First R Function</h3>



<p>Again, an R function can be as simple or as complex as you need it to be, but the anatomy of that function will always have similarities.</p>



<p>Let’s create a really simple function first, and then we’ll break it down.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">firstFun &lt;- function(x,y){
x*y
}
</pre>



<p>This is a very basic function that will multiply one value (x) by another value (y). Truth be told, we don’t need an R function for this, but it’s an easy illustration for our first one.</p>



<p>Paste this function into your console and now you’ll see “firstFun” in your environment explorer under “Functions”, like so:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="422" height="135" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/first-r-function.png" alt="First R function" class="wp-image-3204" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/first-r-function.png 422w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/first-r-function-300x96.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/first-r-function-150x48.png 150w" sizes="(max-width: 422px) 100vw, 422px" /></figure>



<p>Now to run it, we can simply type the following in our console:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">firstFun(5,2)</pre>



<p>Hit return and you’ll see the following output in your console:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="160" height="42" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/first-function-run.png" alt="First R function output" class="wp-image-3206" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/first-function-run.png 160w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/first-function-run-150x39.png 150w" sizes="(max-width: 160px) 100vw, 160px" /></figure>



<p>Nice and simple.</p>



<p>And if we wanted to turn it into an object, we could use the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">firstFunRun &lt;- firstFun(5,2)</pre>



<p>Paste that into your console and you’ll see that firstFunRun is now in your environment.</p>



<p>And again, if you type firstFunRun into your console and hit return, you’ll see the same output as above.</p>



<p>The beauty of using this as a function is the variables (x and y, in this case), meaning we can put any value we like in their place when running the function. Try it yourself.</p>
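
<p>As a quick sketch of that reuse (the values here are arbitrary examples of mine), the same function can be called with whatever numbers you like, including decimals and negatives:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># The same function as above, repeated so this block runs on its own
firstFun &lt;- function(x,y){
  x*y
}

firstFun(10, 3)    # 30
firstFun(2.5, 4)   # 10
firstFun(-7, 6)    # -42
</pre>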



<p>Now we’ve made our first function, let’s expand it a little.</p>



<h3 class="wp-block-heading">Expanding Our First Function</h3>



<p>Here’s how we could make this function a little bit more varied and incorporate some additional values:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">secondFun &lt;- function(x,y){
  
  val1 &lt;- x*y
  
  val2 &lt;- x+y
  
  val3 &lt;- x-y
  
  val4 &lt;- x/y
  
  output &lt;- data.frame(val1,val2,val3,val4)

}
</pre>



<p>So now we’ve made our function ever so slightly more complex and included a few more values, with the idea of exporting all these permutations of our original calculation in one go. This is where a function would be a bit more useful.</p>



<p>As before, paste this into your console. Now run it in the same way as before, saving the result as an object:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">secFunRun &lt;- secondFun(5,2)</pre>



<p>And when we type secFunRun into our console and hit return, we’ll get the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="231" height="61" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/second-r-function-run.png" alt="Second R function running" class="wp-image-3209" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/second-r-function-run.png 231w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/second-r-function-run-150x40.png 150w" sizes="(max-width: 231px) 100vw, 231px" /></figure>



<h3 class="wp-block-heading">Editing Column Names In An R Function</h3>



<p>It’s starting to come together. Now let’s edit our secondFun function to add column headers so we know which value is which.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">secondFun &lt;- function(x,y){
  
  val1 &lt;- x*y
  
  val2 &lt;- x+y
  
  val3 &lt;- x-y
  
  val4 &lt;- x/y
  
  output &lt;- data.frame(val1,val2,val3,val4)
  
  colnames(output) &lt;- c("multiply", "add", "subtract", "divide")
  
  return(output)
  
}</pre>



<p>Paste that into your console and run the same command again, and your output will be updated to show the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="317" height="65" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/second-r-function-run-with-column-names.png" alt="Second R function updated with column names" class="wp-image-3210" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/second-r-function-run-with-column-names.png 317w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/second-r-function-run-with-column-names-300x62.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/second-r-function-run-with-column-names-150x31.png 150w" sizes="(max-width: 317px) 100vw, 317px" /></figure>



<p>As you may have guessed, we don’t really need that colnames step. We could’ve just named the elements that way in the first place, but it’s a handy example of how you can rename columns in a data frame in R.</p>
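
<p>To illustrate that point, here’s a minimal sketch of the same function with the columns named directly inside data.frame(), so the separate renaming step isn’t needed. The name “thirdFun” is just one I’ve made up for this example:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">thirdFun &lt;- function(x,y){

  # Name each column as the data frame is created
  output &lt;- data.frame(multiply = x*y,
                       add = x+y,
                       subtract = x-y,
                       divide = x/y)

  return(output)

}

thirdFunRun &lt;- thirdFun(5,2)
</pre>

<p>Typing thirdFunRun into your console should show the same four named columns as before.</p>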



<p>Again, try it yourself a little. Now let’s look at the anatomy of that final R function and how we build them.</p>



<h2 class="wp-block-heading">The Anatomy Of An R Function</h2>



<p>Again, your functions can be as simple or as complex as you need them to be, but there are some common elements:</p>



<ul class="wp-block-list">
<li><strong>secondFun &lt;- function: </strong>Here, we’re telling R that we’re creating something called “secondFun” and it’s a function</li>



<li><strong>(x,y):</strong> The number of variables required by the function. You can add as many as you want, but it’s standard to start with “x”. If it goes beyond “z”, I typically start again with “a”</li>



<li><strong>{}:</strong> When creating an R function, we use the braces to envelop everything contained within that function, all our commands</li>



<li><strong>val1</strong>&#8211;<strong>4:</strong> The commands that we’re including in our function. In this case, we’re multiplying, adding, subtracting and dividing x and y across different commands and creating different datasets in our function’s environment. You can call them whatever you want and run pretty much any command you like</li>



<li><strong>colnames(output): </strong>We’re telling R what the column names in our output frame should be. This isn’t always necessary, but can certainly be handy if you want specific column headers and your data doesn’t automatically support them</li>



<li><strong>return(output): </strong>By default, an R function only returns the value of the last expression it evaluates, rather than everything created inside it, so you have to be specific about what you want the output to be. Here, we make that explicit with return(). I typically call the object being returned “output”, just for convention</li>
</ul>
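
<p>As a small sketch of how a function hands back its result (the function names here are made up for the example), both of the functions below return the same value. The first relies on R returning the last expression it evaluates, while the second spells it out with return():</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">implicitFun &lt;- function(x,y){
  total &lt;- x+y
  total          # last expression evaluated, so this is what comes back
}

explicitFun &lt;- function(x,y){
  total &lt;- x+y
  return(total)  # same result, but the intent is spelled out
}

implicitFun(5,2)   # 7
explicitFun(5,2)   # 7
</pre>

<p>Objects such as “total” only exist inside the function while it runs, so they won’t appear in your environment afterwards.</p>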



<p>You can have a lot of flexibility with your R functions, but if you follow this general anatomy, your functions should work fine.</p>



<p>As we go through the rest of this series, we’ll be using functions quite a lot, and you’ll see that there are a number of different ways they can be invoked, but they’ll all follow this basic layout, so this should be a good grounding.</p>



<p>Now let’s look at certain commands that make our functions reproducible in different projects, and then we’ll start running through a couple of my favourite functions.</p>



<h2 class="wp-block-heading">Using “Require” For Packages To Make R Functions Reproducible</h2>



<p>As you go further on your R journey, you’ll inevitably find yourself re-using functions that you’ve written in other projects. But what if they depend on certain packages?</p>



<p>You can, of course, load those packages at the start of your project, but sometimes it can be tough to remember which ones you used, which is why we can call them from our function directly. That’s where the “require” command comes in.</p>



<p>The require command does depend on you having installed the package in the past, so it’s not flawless if you’re sharing code, but at least with this notation in your function, you&#8217;ll know which packages you need.</p>
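
<p>One way to soften that limitation, sketched here as an assumption of mine rather than something the functions in this post do, is to check for the package first and stop with a helpful message if it’s missing. requireNamespace() is part of base R, while “checkPackage” is a hypothetical helper name:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">checkPackage &lt;- function(x){

  # requireNamespace returns FALSE, rather than erroring, if the package isn't installed
  if(!requireNamespace(x, quietly = TRUE)){
    stop(paste0("Please run install.packages('", x, "') first"))
  }

  # Load the package now that we know it's there
  require(x, character.only = TRUE)

}

# "stats" ships with R, so this should always load
checkPackage("stats")
</pre>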



<p>Let’s do a function with our Google Analytics and Search Console commands from <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">part 2</a>, turning them into one dataframe with variable dates so we can re-use it easily.</p>



<p>If we authorised everything properly in the previous sessions, we shouldn’t need to go through that again, but you will still need your Google Analytics View ID to hand.</p>



<h3 class="wp-block-heading">A Google Analytics &amp; Search Console R Function With “Require”</h3>



<p>Here&#8217;s how the function looks:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gaSCR &lt;- function(x, y, z, a){
  
  require(googleAnalyticsR)
  
  require(searchConsoleR)
  
  require(tidyverse)
  
  ga_data &lt;- google_analytics(viewId = x,
                              date_range = c(z, a),
                              metrics = c("sessions", "users", "pageviews", "bouncerate"),
                              dimensions = "date")
  
  sc_data &lt;- search_analytics(site = y,
                              start_date = z,
                              end_date = a,
                              dimensions = c("date"),
                              metrics = c("impressions", "clicks"))
  
  merged_data &lt;- merge(ga_data, sc_data, by = "date", all = TRUE)
  
  merged_data$ctr &lt;- (merged_data$clicks / merged_data$impressions) * 100
  
  return(merged_data)
  
}</pre>



<p>You’ll see that, at the start, we’ve put “require” and named the packages that we’ll need for this piece. That means that, if they’re not already loaded in, R will load them whenever you run your function. However, they will need to have been installed previously.</p>



<p>Now we need to set our variables for x, y, z and a – our Google Analytics View ID, our Search Console website URL, our start date and our end date. If you go back to <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">part 2</a>, you’ll find how to do that.</p>



<p>For this example, I’ve used the following:</p>



<ul class="wp-block-list">
<li><strong>x:</strong> view_id &lt;- &#8220;GOOGLE-ANALYTICS-VIEW-ID&#8221;</li>



<li><strong>y:</strong> site_url &lt;- &#8220;YOUR-WEBSITE-URL&#8221;</li>



<li><strong>z:</strong> start_date &lt;- &#8220;YOUR-START-DATE&#8221;</li>



<li><strong>a:</strong> end_date &lt;- &#8220;YOUR-END-DATE&#8221;</li>
</ul>
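
<p>Written out as code, that setup might look like the sketch below. The values are placeholders of mine, so swap in your own, and I’m assuming here that the dates are written as “YYYY-MM-DD” text:</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">view_id &lt;- "123456789"                   # placeholder View ID
site_url &lt;- "https://www.example.com/"   # placeholder Search Console property
start_date &lt;- "2023-01-01"               # placeholder start date
end_date &lt;- "2023-01-31"                 # placeholder end date

# Optional sanity check: the start date shouldn't be after the end date
stopifnot(as.Date(start_date) &lt;= as.Date(end_date))
</pre>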



<p>Now paste the function and those variables into your console and follow them up with:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">googleData &lt;- gaSCR(view_id, site_url, start_date, end_date)</pre>



<p>And there we go, a really handy R function for pulling Google Analytics and Search Console data into one dataframe with variable dates. Try it yourself and feel free to change it to include different metrics and dimensions.</p>



<h3 class="wp-block-heading">How The Merged Google Analytics &amp; Google Search Console Function Works</h3>



<p>Let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>gaSCR &lt;- function(x, y, z, a){:</strong> As always, we’re creating our function, naming it gaSCR and our variables are x, y, z and a</li>



<li><strong>require:</strong> Here, we’re telling R that there are packages required for this function to run, googleAnalyticsR, searchConsoleR and tidyverse, in this case. These will be packages you’ve installed before, if you’ve been following along, but if not, you’ll need to use the install.packages command for them</li>



<li><strong>ga_data &lt;- google_analytics(viewId = x:</strong> We’re telling R to create an object called ga_data and it’s coming from the google_analytics function in googleAnalyticsR, focusing on our viewID variable which is x in this function</li>



<li><strong>date_range = c(z, a): </strong>As we saw in part 2, we’re invoking the date_range element of our Google Analytics call. We’re combining our start date (z) and our end date (a) into a single date range</li>



<li><strong>metrics = c(&#8220;sessions&#8221;, &#8220;users&#8221;, &#8220;pageviews&#8221;, &#8220;bouncerate&#8221;):</strong> We’re telling R that the metrics we want from Google Analytics are Sessions, Users, Pageviews and Bounce Rate. You can change these to your own requirements</li>



<li><strong>dimensions = &#8220;date&#8221;): </strong>The last part of our Google Analytics call is to split our data by dimension. Date, in this case</li>



<li><strong>sc_data &lt;- search_analytics(site = y,: </strong>Now we’re starting our Google Search Console data pull, and our site URL is being invoked with y</li>



<li><strong>start_date = z, end_date = a:</strong> As with our Google Analytics call, we’re invoking our date range variables in our Google Search Console data</li>



<li><strong>dimensions = c(&#8220;date&#8221;):</strong> We’re splitting our Google Search Console data by date, in the same way we did our Google Analytics data</li>



<li><strong>metrics = c(&#8220;impressions&#8221;, &#8220;clicks&#8221;)): </strong>The metrics that we want from Google Search Console are impressions and clicks. That ends our Google Search Console call</li>



<li><strong>merged_data &lt;- merge(ga_data, sc_data, by = &#8220;date&#8221;, all = TRUE):</strong> We’re using the merge function, which is built into base R, to combine our Google Analytics and Search Console datasets with the date being the anchor. Setting all = TRUE keeps dates that only appear in one of the two datasets</li>



<li><strong>merged_data$ctr &lt;- (merged_data$clicks / merged_data$impressions) * 100:</strong> Now we’re creating another column in our data which calculates click-through rate as a percentage</li>



<li><strong>return(merged_data): </strong>Finally, we’re returning the merged dataframe as the output of our function</li>
</ul>



<p>Now we can see how to merge a Google Analytics and Google Search Console dataset into a single dataframe. Pretty handy, right?</p>
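


<p>To see what the merge and CTR steps do without calling either API, here’s a minimal sketch on two made-up dataframes. The names ga_toy and sc_toy and all of the numbers are invented for illustration; only the merge call and the CTR calculation mirror the function above.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy stand-ins for the Google Analytics and Search Console pulls
ga_toy &lt;- data.frame(date = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")),
                     sessions = c(120, 95, 140))

sc_toy &lt;- data.frame(date = as.Date(c("2024-01-02", "2024-01-03", "2024-01-04")),
                     impressions = c(1000, 1500, 800),
                     clicks = c(50, 60, 20))

# all = TRUE keeps dates that appear in only one dataset, filling the gaps with NA
merged_toy &lt;- merge(ga_toy, sc_toy, by = "date", all = TRUE)

# Click-through rate as a percentage
merged_toy$ctr &lt;- (merged_toy$clicks / merged_toy$impressions) * 100

merged_toy</pre>



<p>The first and last dates each appear in only one of the toy datasets, so the columns from the other dataset come back as NA on those rows rather than the rows being dropped.</p>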



<p><strong>Note:</strong> This is still using Universal Analytics. I will be updating all these posts to utilise the GA4 API in the near future.</p>



<p>Now let’s look at a couple of my other favourite R functions that I find myself using a lot.</p>



<h2 class="wp-block-heading">Merging Multiple CSV Files In R</h2>



<p>This was one of the first functions I worked with outside of a course. It’s pretty old now, and I’m sure there are better ways to do it, but it still works reliably for merging multiple CSV files and getting them into your environment in R.</p>



<p>First, make sure your CSV files are all in one dedicated folder, preferably a subfolder of your project directory. You can navigate to them in RStudio through the “Files” pane, like so:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="578" height="445" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/navigate-files-pane-rstudio.png" alt="Navigating files with RStudio" class="wp-image-3221" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/navigate-files-pane-rstudio.png 578w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/navigate-files-pane-rstudio-300x231.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/navigate-files-pane-rstudio-150x115.png 150w" sizes="(max-width: 578px) 100vw, 578px" /></figure>



<p>Now we need to tell R to switch our working directory to that, which we can do like so, using RStudio:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="642" height="357" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/set-working-directory-rstudio.png" alt="Set working directory in RStudio" class="wp-image-3222" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/set-working-directory-rstudio.png 642w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/set-working-directory-rstudio-300x167.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/set-working-directory-rstudio-150x83.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/set-working-directory-rstudio-630x350.png 630w" sizes="(max-width: 642px) 100vw, 642px" /></figure>



<p>Alright, now we’re where we need to be, here’s the function I use to merge multiple CSV files in R and create a dataframe from them.</p>



<p>Before we continue, it’s <em>vital</em> that the CSV files all have the same headers. There can be different amounts of data in there, but if they’ve got different headers, this function won’t work. That said, I’ve found it really handy over the years for dealing with multiple exports and so on. Assuming all that’s true, here’s how to run it.</p>



<p>Paste the following into your console:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">csvMerge &lt;- function(x){
  
  require(plyr)
  
  csvFiles &lt;- dir(pattern = x, full.names=TRUE)
  
  output &lt;- ldply(csvFiles, read.csv)
  
}</pre>



<p>Now run it like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">csvFiles &lt;- csvMerge("csv$")</pre>



<p>Depending on how many files and how big they are, it may take a few seconds, but it’s a lot quicker than copying and pasting in Excel! Now you should have a dataframe called “csvFiles” in your R environment like so:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="481" height="141" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/merged-csv-files-with-r.png" alt="Merged CSV files with R" class="wp-image-3223" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/merged-csv-files-with-r.png 481w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/merged-csv-files-with-r-300x88.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/merged-csv-files-with-r-150x44.png 150w" sizes="(max-width: 481px) 100vw, 481px" /></figure>



<p>Let’s break down how it works:</p>



<ul class="wp-block-list">
<li><strong>csvMerge &lt;- function (x){:</strong> We’re telling R that we’re creating a new object called csvMerge, it’s a function and there is one variable utilised called x. The opening brace is what the function’s commands will be contained in</li>



<li><strong>require(plyr):</strong> Our function depends on the <a href="https://cran.r-project.org/web/packages/plyr/index.html" target="_blank">plyr package</a>. If you don’t already have it installed, you’ll want to use the install.packages("plyr") command to get it installed. If you do already have it installed, this command will load it</li>



<li><strong>csvFiles &lt;- dir(pattern = x, full.names=TRUE):</strong> Here, we’re telling R to look through the current working directory for files whose names match the pattern in our x variable, and to return each one with its path (that’s what full.names = TRUE does) so read.csv can find it. These names are stored in a csvFiles object. In this case, our x variable is “csv$”, a regular expression that matches any file name ending in “csv”</li>



<li><strong>output &lt;- ldply(csvFiles, read.csv)}:</strong> Our output is to use the <a href="https://search.r-project.org/CRAN/refmans/plyr/html/ldply.html" target="_blank">ldply</a> function from the plyr package to apply our chosen function (read.csv, in this case) across our list (csvFiles) and merge the results into a single dataframe. The closing brace says our function is complete</li>
</ul>
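


<p>Because x is treated as a regular expression, “csv$” matches any file name that ends in the letters csv, whether or not there’s a dot in front of them. A quick way to check what a pattern will pick up is to try it on a few made-up file names with grepl. The file names below are invented, and the stricter pattern is my suggestion rather than part of the function above.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up file names to test the patterns against
fileNames &lt;- c("january.csv", "february.csv", "notes.txt", "old_csv", "MARCH.CSV")

# The pattern used above: anything ending in "csv"
loosePattern &lt;- grepl("csv$", fileNames)

# A stricter pattern: a literal dot followed by "csv" at the end
strictPattern &lt;- grepl("\\.csv$", fileNames)

# ignore.case = TRUE also picks up upper-case extensions
anyCasePattern &lt;- grepl("\\.csv$", fileNames, ignore.case = TRUE)

data.frame(fileNames, loosePattern, strictPattern, anyCasePattern)</pre>



<p>In a folder that only contains your exports, the simpler pattern is fine. The stricter one is worth considering if other files live alongside them.</p>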



<p>And there you have it: a quick and easy way to merge multiple CSV files in R. Even if you didn’t need them for analysis and just needed them all in one file, you can still use this function and then export it like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">write.csv(csvFiles, "merged csvs.csv")</pre>



<p>This will export it all into one CSV file. I can’t tell you how handy this function has been to me over the years, so hopefully it’ll help you too.</p>
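


<p>If you’d rather not load plyr, the same idea can be sketched in base R. This is an alternative under my own assumptions rather than the function above: csvMergeBase is a hypothetical name, it takes a folder path instead of relying on the working directory, and it stops with a clear error if the headers don’t match. The demo at the bottom writes two tiny made-up files to a temporary folder so the example runs on its own.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">csvMergeBase &lt;- function(folder = ".", pattern = "\\.csv$"){

  # Full paths of every matching file in the folder
  csvFiles &lt;- list.files(folder, pattern = pattern, full.names = TRUE)

  # Read each file into a list of dataframes
  dataList &lt;- lapply(csvFiles, read.csv, stringsAsFactors = FALSE)

  # Check every file has the same headers as the first one
  sameHeaders &lt;- vapply(dataList,
                        function(d) identical(names(d), names(dataList[[1]])),
                        logical(1))

  if (!all(sameHeaders)) {
    stop("Headers differ in: ", paste(basename(csvFiles[!sameHeaders]), collapse = ", "))
  }

  # Stack the dataframes on top of each other
  do.call(rbind, dataList)

}

# Demo on two small made-up files written to a temporary folder
demoFolder &lt;- file.path(tempdir(), "csvMergeDemo")
dir.create(demoFolder, showWarnings = FALSE)

write.csv(data.frame(url = "a", clicks = 1),
          file.path(demoFolder, "one.csv"), row.names = FALSE)

write.csv(data.frame(url = c("b", "c"), clicks = 2:3),
          file.path(demoFolder, "two.csv"), row.names = FALSE)

mergedDemo &lt;- csvMergeBase(demoFolder)</pre>



<p>The row.names = FALSE argument in the demo stops write.csv adding a column of row numbers to the exported file, which is usually what you want when the CSV is going to be opened elsewhere.</p>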



<h2 class="wp-block-heading">Download XML Sitemaps &amp; Check Status Codes In R</h2>



<p>Obviously, there are better ways to do this, using tools like <a href="https://sitebulb.com/" target="_blank">Sitebulb</a> or <a href="https://www.screamingfrog.co.uk/seo-spider/" target="_blank">Screaming Frog</a>, but as SEOs, I’m sure we’ve all experienced the delays and challenges of software or getting IT departments to actually let us install them, which is why I created this little piece a while back.</p>



<p>This function downloads the XML sitemap from the target URL, scrapes the URLs from it using the <a href="https://rvest.tidyverse.org/" target="_blank">Rvest package</a> (another Hadley Wickham creation), and checks the HTTP status code of them using the <a href="https://cran.r-project.org/web/packages/httr/index.html" target="_blank">httr package</a>. Although sitemaps aren’t the be-all and end-all from an SEO perspective, it’s always worthwhile to have a clean one, free of redirects or broken links. Here’s how you can check your XML sitemap with R:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sitemapTestR &lt;- function(x){
  
  require(rvest)
  
  require(httr)
  
  sitemap_html &lt;- content(GET(x), "text")
  
  sitemap &lt;- read_html(sitemap_html)
  
  urls &lt;- sitemap %>% html_nodes("loc") %>% 
    html_text()
  
  results &lt;- data.frame(url = character(), status_code = integer(),
                        stringsAsFactors = FALSE)
  
  for (url in urls) {
    response &lt;- GET(url)
    status_code &lt;- status_code(response)    
    results &lt;- rbind(results, data.frame(url = url, status_code = status_code, 
                                         stringsAsFactors = FALSE))
  }
  
  output &lt;- results
  
}</pre>



<p>It’s not the smallest of functions, but it works well.</p>



<h3 class="wp-block-heading">How The SitemapTestR Function Works</h3>



<p>Since this function is a bit of a beast and has a couple of elements we’ve not covered yet, I thought it was worth giving it its own section. As always, let’s break it down:</p>



<ul class="wp-block-list">
<li><strong>sitemapTestR &lt;- function(x){:</strong> As before, we’re naming our function – sitemapTestR in this case (see what I did there?), it’s got a single variable called “x” and we’re opening our braces to contain our commands</li>



<li><strong>require(:</strong> Again, we’re telling R that this function depends on the httr and rvest packages</li>



<li><strong>sitemap_html &lt;- content(GET(x), &#8220;text&#8221;):</strong> We’re using functions from the httr package to run a <a href="https://www.w3schools.com/tags/ref_httpmethods.asp" target="_blank">GET request</a> to request data from the server – the contents of the sitemap in this case, and we’re storing it as a text object</li>



<li><strong>sitemap &lt;- read_html(sitemap_html):</strong> This part uses the read_html function of the rvest package to take the text we’ve just downloaded and turn it into html</li>



<li><strong>urls &lt;- sitemap %&gt;% html_nodes(&#8220;loc&#8221;) %&gt;%&nbsp; html_text():</strong> One of the great things about the <a href="https://www.tidyverse.org/" target="_blank">Tidyverse</a> and its associates is the ability to chain commands using %&gt;%. In this example, we’re creating an object called “urls”, starting from our “sitemap” object. We pass it to rvest’s html_nodes function, which pulls out the “nodes” of our sitemap called “loc” (the tags that hold each URL in an XML sitemap), and then to html_text, which keeps only the text within them</li>



<li><strong>results &lt;- data.frame(url = character(), status_code = integer(), stringsAsFactors = FALSE):</strong> This creates an empty dataframe called “results”, with the headers “url” and “status_code”, which are a character and integer respectively. We don’t want our strings to be seen as factors</li>



<li><strong>for (url in urls){:</strong> Here’s where we start to have some fun. We’re nesting a for <a href="https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/" data-wpil-monitor-id="207">loop</a> inside our function. I’ve used loops in a couple of my other <a href="https://www.ben-johnston.co.uk/category/r/">R language</a> posts, but this is the first time we’ve used one in this series. There will be more to follow on loops in a later post, but here, we’re starting our loop with for, saying “for each url in the urls object, perform the commands below”. Similar to a function, these commands are contained in braces</li>



<li><strong>response &lt;- GET(url):</strong> Using the httr package’s GET function, we’re requesting the URL and storing the server’s response in an object called “response”</li>



<li><strong>status_code &lt;- status_code(response): </strong>With our trusty httr package, the status code is extracted from that response and stored in an object called “status_code”</li>



<li><strong>results &lt;- rbind(results, data.frame(url = url, status_code = status_code, stringsAsFactors = FALSE))}:</strong> Finally, the result of this test is added to our results dataframe with the URL entering the url column and the status code going into the status_code column, with no strings being factors. The rbind command says to merge them into a single frame as our loop runs these commands over every URL in our urls object and our closing brace finishes the loop</li>



<li><strong>output &lt;- results}:</strong> And we’re finally finished with our function. Our output is the results dataframe that the loop has built up, and our closing brace finishes the function</li>
</ul>



<p>Phew! I told you it was a beast of a function, but this is part of why we would use a function rather than interactive code. If you had a situation where you had to test several sitemaps, this could amount to hundreds of lines of code and hours of work, whereas this function allows you to run the same commands in a lot less time and with a lot less code.</p>
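


<p>One caveat worth flagging: as far as I’m aware, httr’s GET follows redirects by default, so a URL that redirects to a live page will be reported with the final status code (usually 200) rather than the 301, and a URL that fails to connect at all will stop the loop with an error. The sketch below is one way to handle both, under my own assumptions: safeStatus and sitemapStatusSafe are hypothetical names, the 10-second timeout is arbitrary, and followlocation is a curl option passed through httr’s config function.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(httr)

# Hypothetical helper: status code of a single URL, or NA if the request fails
safeStatus &lt;- function(url){

  tryCatch(
    # followlocation = 0L asks curl not to follow redirects,
    # so a 301 is reported as a 301
    as.integer(status_code(GET(url, config(followlocation = 0L), timeout(10)))),
    error = function(e) NA_integer_
  )

}

# Check a vector of URLs and return one row per URL
sitemapStatusSafe &lt;- function(urls){

  codes &lt;- vapply(urls, safeStatus, integer(1), USE.NAMES = FALSE)

  data.frame(url = urls, status_code = codes, stringsAsFactors = FALSE)

}</pre>



<p>You would pass it the vector of URLs scraped from the sitemap, in place of the for loop. Using vapply also avoids growing the dataframe one row at a time with rbind, which can get slow on large sitemaps.</p>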



<h3 class="wp-block-heading">Running The SitemapTestR Function</h3>



<p>To run it, we need to create our x variable, which is the URL of our sitemap. Let’s use my posts sitemap as an example:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sitemap_url &lt;- "https://www.ben-johnston.co.uk/post-sitemap.xml"</pre>



<p>And to run it, we would use the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sitemapStatus &lt;- sitemapTestR(sitemap_url)</pre>



<p>Give it a few seconds to run, and you’ll see the following in your environment:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="513" height="198" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/sitemap-tester-in-r.png" alt="XML sitemap tester in R" class="wp-image-3228" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/sitemap-tester-in-r.png 513w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/sitemap-tester-in-r-300x116.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/sitemap-tester-in-r-150x58.png 150w" sizes="(max-width: 513px) 100vw, 513px" /></figure>



<p>Explore it with the str(sitemapStatus) command, and you should see the below:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="904" height="125" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-sitemap-test-structure.png" alt="XML sitemap structure in R" class="wp-image-3229" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-sitemap-test-structure.png 904w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-sitemap-test-structure-300x41.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-sitemap-test-structure-150x21.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/r-sitemap-test-structure-768x106.png 768w" sizes="(max-width: 904px) 100vw, 904px" /></figure>



<p>And of course, you can export it to CSV with our trusty write.csv command.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">write.csv(sitemapStatus, "sitemapStatus.csv")</pre>



<p>And there we have it. Your XML sitemap tested in an R function. It’s certainly saved me some hassle over the years, and I hope it helps you too.</p>
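


<p>Once you have the results, the rows you usually care about are the ones that didn’t return a 200. Here’s a minimal sketch on a made-up results dataframe. The URLs and status codes below are invented; with real data you would use sitemapStatus in place of toyStatus.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">toyStatus &lt;- data.frame(url = c("https://www.example.com/",
                                "https://www.example.com/old-page/",
                                "https://www.example.com/missing/"),
                        status_code = c(200L, 301L, 404L),
                        stringsAsFactors = FALSE)

# Keep only the URLs that need attention
problemUrls &lt;- toyStatus[toyStatus$status_code != 200, ]

# Count how many URLs returned each status code
statusCounts &lt;- table(toyStatus$status_code)

problemUrls</pre>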



<h2 class="wp-block-heading">Strip URLs To Domains In R</h2>



<p>This is a function that I use quite a lot, especially when doing link or competitor analysis using APIs in R (which we’ll cover in a couple of pieces’ time).</p>



<p>With this function, we take a URL from another column and strip it down to the domain using a very simple regular expression. This function has served me well for many years and uses an sapply method to run through a series of URLs. The function is as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">domainNames &lt;- function(x){
  
  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
  
}</pre>



<p>But we need some test data to run it on. If you go <a href="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/tvUnits.csv">here</a>, you’ll find a spreadsheet with the top-ranking URLs for the term “TV Units” from <a href="https://seranking.com/?ga=2640572&amp;source=link" target="_blank">SE Ranking</a>, which you can use for a test.</p>



<h3 class="wp-block-heading">Stripping URLs To Domains With Our Test Data</h3>



<p>Firstly, download the test data above and place it in your project folder.</p>



<p>Now read it in using the read.csv command as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tvUnits &lt;- read.csv("tvUnits.csv", stringsAsFactors = FALSE)</pre>



<p>As I say, we’ll be covering how to pull this in directly from <a href="https://seranking.com/?ga=2640572&amp;source=link" target="_blank">SE Ranking</a> in a couple of weeks, but for now, this will work.</p>



<p>Now, after you’ve read your function into your R console, you can use the following command to strip the ranking URLs to domains in a new column in your dataset:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tvUnits$Domain &lt;- sapply(tvUnits$URL, domainNames)</pre>



<p>If we look at it using the str() command, we’ll see the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="918" height="301" src="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/urls-to-domains-r-function.png" alt="URLs to domains in R" class="wp-image-3235" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/urls-to-domains-r-function.png 918w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/urls-to-domains-r-function-300x98.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/urls-to-domains-r-function-150x49.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2024/01/urls-to-domains-r-function-768x252.png 768w" sizes="(max-width: 918px) 100vw, 918px" /></figure>



<p>And there we have it. URLs stripped to domains using a fairly simple R function with no additional packages.</p>



<p>We’ll be talking about sapply and other <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/"   title="apply" data-wpil-keyword-link="linked"  data-wpil-monitor-id="228">apply</a> methods in R in a few weeks, but for now, think of sapply as a “simplified apply”. It isn’t too dissimilar to a loop, in that it applies our function to every element in our list of URLs and simplifies the results into a vector.</p>
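


<p>To make the comparison with a loop concrete, here’s a small sketch that runs the same function both ways on a made-up vector of URLs and confirms the results match. The testUrls vector is invented for illustration, and USE.NAMES = FALSE is my addition to stop sapply labelling each result with its input URL.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">domainNames &lt;- function(x){

  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]

}

testUrls &lt;- c("https://www.example.com/tv-units/",
              "http://example.org/furniture/tv-stands",
              "https://shop.example.co.uk/")

# The sapply version: one line
viaSapply &lt;- sapply(testUrls, domainNames, USE.NAMES = FALSE)

# The equivalent for loop: more typing, same result
viaLoop &lt;- character(length(testUrls))

for (i in seq_along(testUrls)) {
  viaLoop[i] &lt;- domainNames(testUrls[i])
}

# TRUE if both approaches produced exactly the same vector
identical(viaSapply, viaLoop)</pre>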



<h3 class="wp-block-heading">How It Works</h3>



<p>While this function isn’t as large as our sitemap tester, there are still a couple of areas that we’ve not covered yet, so again, worth having its own section.</p>



<p>Let’s break it down.</p>



<ul class="wp-block-list">
<li><strong>domainNames &lt;- function(x){: </strong>As always, we’re creating a new function called domainNames and it has a variable called x</li>



<li><strong>strsplit(gsub(: </strong>We’re using the strsplit and gsub functions from base R. Strsplit – or “string split” means to break up a part of a string of text or numbers (our URL in this case) and gsub is effectively R’s way of invoking the classic find and replace</li>



<li><strong>&#8220;http://|https://|www\\.&#8221;, &#8220;&#8221;, x), &#8220;/&#8221;)[[c(1, 1)]]}:</strong> This intimidating-looking piece is actually a relatively simple regular expression. The pattern matches http://, https:// or www., and gsub replaces each match with nothing, leaving the URL starting at the domain name. strsplit then breaks what’s left at every forward slash, and [[c(1, 1)]] keeps only the first piece, which is the domain. Regular expressions are a great addition to any SEO’s skillset, either in your R journey or in general, as they can be used for so many different tasks. If you’d like to learn more about them, I love using the <a href="https://www.therobinlord.com/projects/slash-escape" target="_blank">Slash\Escape game by Robin Lord</a> as a training exercise</li>
</ul>
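


<p>It can help to run the pieces one at a time on a single URL to see what each step returns. The URL below is made up for illustration:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">testUrl &lt;- "https://www.example.com/tv-units/oak/"

# Step 1: gsub removes the protocol and the www.
noPrefix &lt;- gsub("http://|https://|www\\.", "", testUrl)
noPrefix

# Step 2: strsplit breaks what is left at every forward slash, returning a list
pieces &lt;- strsplit(noPrefix, "/")
pieces

# Step 3: [[c(1, 1)]] takes the first element of the first list item: the domain
domain &lt;- pieces[[c(1, 1)]]
domain</pre>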



<p>And that’s how you strip URLs to domains using R. Simple, elegant and very quick to run. It’s been very useful for me over the years and I hope it helps you too.</p>



<h2 class="wp-block-heading">Wrapping Up</h2>



<p>We’ve been on quite a journey today, haven’t we? Hopefully this has given you a good introduction to using R functions, particularly for SEO. There’re going to be quite a lot of functions in the next half of this series, so it’s worth becoming familiar with them.</p>



<p>I hope you’ve found this useful and that you’ll join me next time for <a href="https://www.ben-johnston.co.uk/r-for-seo-part-5-common-excel-formulas-in-r/" data-wpil-monitor-id="210">part 5 where we’ll start replicating common Excel formulae</a> in R. I promise this one won’t take me as long to post!</p>



<h3 class="wp-block-heading">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># First Function

firstFun &lt;- function(x,y){
  
  x*y

}

firstFun(5,2)

## First Function As An Object

firstFunRun &lt;- firstFun(5,2)

# Second Function

secondFun &lt;- function(x,y){
  
  val1 &lt;- x*y
  
  val2 &lt;- x+y
  
  val3 &lt;- x-y
  
  val4 &lt;- x/y
  
  output &lt;- data.frame(val1,val2,val3,val4)

}

secFunRun &lt;- secondFun(5,2)

## Add Column Names To Second Function

secondFun &lt;- function(x,y){
  
  val1 &lt;- x*y
  
  val2 &lt;- x+y
  
  val3 &lt;- x-y
  
  val4 &lt;- x/y
  
  output &lt;- data.frame(val1,val2,val3,val4)
  
  colnames(output) &lt;- c("multiply", "add", "subtract", "divide")
  
  return(output)
  
}

secFunRun &lt;- secondFun(5,2)

# Google Analytics &amp; Search Console

ga_auth()

scr_auth()

gaSCR &lt;- function(x, y, z, a){
  
  require(googleAnalyticsR)
  
  require(searchConsoleR)
  
  require(tidyverse)
  
  ga_data &lt;- google_analytics(viewId = x,
                              date_range = c(z, a),
                              metrics = c("sessions", "users", "pageviews", "bouncerate"),
                              dimensions = "date")
  
  sc_data &lt;- search_analytics(site = y,
                              start_date = z,
                              end_date = a,
                              dimensions = c("date"),
                              metrics = c("impressions", "clicks"))
  
  merged_data &lt;- merge(ga_data, sc_data, by = "date", all = TRUE)
  
  merged_data$ctr &lt;- (merged_data$clicks / merged_data$impressions) * 100
  
  return(merged_data)
  
}

## Using GA/ GSC Function

view_id &lt;- "GOOGLE-ANALYTICS-VIEW-ID"
site_url &lt;- "YOUR-WEBSITE-URL"
start_date &lt;- "YOUR-START-DATE"
end_date &lt;- "YOUR-END-DATE"

googleData &lt;- gaSCR(view_id, site_url, start_date, end_date)

# Merge Multiple CSV Files

csvMerge &lt;- function(x){
  
  require(plyr)
  
  csvFiles &lt;- dir(pattern = x, full.names=TRUE)
  
  output &lt;- ldply(csvFiles, read.csv)
  
}

csvFiles &lt;- csvMerge("csv$")

write.csv(csvFiles, "merged csvs.csv")

# Scrape &amp; Test XML Sitemaps

sitemapTestR &lt;- function(x){
  
  require(rvest)
  
  require(httr)
  
  sitemap_html &lt;- content(GET(x), "text")
  
  sitemap &lt;- read_html(sitemap_html)
  
  urls &lt;- sitemap %>% html_nodes("loc") %>% 
    html_text()
  
  results &lt;- data.frame(url = character(), status_code = integer(),
                        stringsAsFactors = FALSE)
  
  for (url in urls) {
    response &lt;- GET(url)
    status_code &lt;- status_code(response)    
    results &lt;- rbind(results, data.frame(url = url, status_code = status_code, 
                                         stringsAsFactors = FALSE))
  }
  
  output &lt;- results
  
}

sitemap_url &lt;- "https://www.ben-johnston.co.uk/post-sitemap.xml"

sitemapStatus &lt;- sitemapTestR(sitemap_url)

write.csv(sitemapStatus, "sitemapStatus.csv")

# Strip URLs To Domain Names

domainNames &lt;- function(x){
  
  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
  
}

tvUnits &lt;- read.csv("tvUnits.csv", stringsAsFactors = FALSE)

tvUnits$Domain &lt;- sapply(tvUnits$URL, domainNames)

write.csv(tvUnits, "tvUnitsDomain.csv")</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 3: Data Visualisation With GGPlot2 &amp; Wordcloud</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-3-data-visualisation-with-ggplot2-wordcloud/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Mon, 25 Apr 2022 02:39:00 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">http://167.71.131.91/?p=3063</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-3-data-visualisation-with-ggplot2-wordcloud/">R For SEO Part 3: Data Visualisation With GGPlot2 &#038; Wordcloud</a></p>
<p>Welcome back. You’ve made it to part three, where we’re going to start having a bit of fun with the R language...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-3-data-visualisation-with-ggplot2-wordcloud/">R For SEO Part 3: Data Visualisation With GGPlot2 &#038; Wordcloud</a></p>
<p>Welcome back. You’ve made it to part three, where we’re going to start having a bit of fun with the <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R language and SEO</a> data. Hopefully the first couple of parts gave you a bit of a grounding in the basics, <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">how to use R packages</a> and <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">how to get Google Analytics and Search Console data in R</a>. Today, we’re going to do some simple visualisation work on that data using the GGPlot2 package.&nbsp;</p>



<p>In today’s piece, we’re going to use the GGPlot2 package and some Google Analytics and Search Console data to make basic bar charts, line graphs, a combo plot and finally a really cool wordcloud with our search queries.</p>



<p>I know today&#8217;s piece is super late. Work has been crazy and I&#8217;ve been moving house, which comes with its own fun challenges.</p>



<p>Ready to get started? Alright, let’s do this.</p>



<div class="wp-block-advanced-gutenberg-blocks-summary"><p class="wp-block-advanced-gutenberg-blocks-summary__title">Contents</p><div class="wp-block-advanced-gutenberg-blocks-summary__fold"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-chevron-up"><polyline points="18 15 12 9 6 15"></polyline></svg></div><ol role="directory" class="wp-block-advanced-gutenberg-blocks-summary__list"><li><a href="https://www.ben-johnston.co.uk#the-ggplot2-package-and-the-tidyverse">The GGPlot2 Package And The Tidyverse</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#installing-our-packages">Installing Our Packages&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#our-datasets">Our Datasets&nbsp;</a><ol><li><a href="https://www.ben-johnston.co.uk#google-analytics-datasets-for-our-visualisations">Google Analytics Datasets For Our Visualisations</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#google-search-console-datasets">Google Search Console Datasets</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#ggplot2-for-seo">GGPlot2 For SEO&nbsp;</a><ol><li><a href="https://www.ben-johnston.co.uk#a-basic-ggplot2-line-graph-with-google-analytics-data">A Basic GGPlot2 Line Graph With Google Analytics Data&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#creating-comparison-line-graphs-with-google-analytics-data-in-r">Creating Comparison Line Graphs With Google Analytics Data In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#building-bar-charts-with-search-console-data-in-ggplot2">Building Bar Charts With Search Console Data In GGPlot2&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#building-a-combo-chart-of-google-search-console-impressions-amp;-clicks-in-ggplot2">Building A Combo Chart Of Google Search Console Impressions 
&amp; Clicks In GGPlot2</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#creating-wordclouds-in-r-using-google-search-console-data">Creating Wordclouds In R Using Google Search Console Data</a><ol><li><a href="https://www.ben-johnston.co.uk#google-search-console-query-data-in-r">Google Search Console Query Data In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#creating-amp;-cleaning-a-text-corpus-in-r">Creating &amp; Cleaning A Text Corpus In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#why-we-clean-text-corpuses">Why We Clean Text Corpuses</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#making-a-corpus-lowercase-in-r">Making A Corpus Lowercase In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#removing-punctuation-in-an-r-corpus">Removing Punctuation In An R Corpus</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#removing-stopwords-in-an-r-corpus">Removing Stopwords In An R Corpus</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#stemming-words-in-an-r-corpus">Stemming Words In An R Corpus</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#our-wordcloud-with-google-search-console-query-data-in-r">Our Wordcloud With Google Search Console Query Data In R&nbsp;</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#exporting-data-visualisations-in-rstudio">Exporting Data Visualisations In RStudio</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#wrapping-up">Wrapping Up</a><ol><li><a href="https://www.ben-johnston.co.uk#our-code-from-today">Our Code From Today</a><ol></ol></li></ol></li></ol></div>




<h2 class="wp-block-heading" id="the-ggplot2-package-and-the-tidyverse">The GGPlot2 Package And The Tidyverse</h2>



<p>In <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">part 2</a>, we discussed the Tidyverse collection of packages and how great it is. GGPlot2 is part of that collection, so we’ll be loading it that way.&nbsp;</p>



<p>GGPlot2 is the industry standard for visualising data using R and, truthfully, I’ve not come across a better package for this purpose. There are debates between GGPlot2 and MatPlotLib for Python, but personally, I tend to skew towards GGPlot2’s Python equivalent <a href="https://plotnine.readthedocs.io/en/stable/" target="_blank" rel="noreferrer noopener">plotnine</a> since it uses the same Grammar of Graphics approach and it’s just familiar when I use Python.&nbsp;</p>



<p>If you feel you’re going to be doing a lot of data visualisation during your R journey, it’s really worth getting familiar with the concepts behind the <a href="https://towardsdatascience.com/a-comprehensive-guide-to-the-grammar-of-graphics-for-effective-visualization-of-multi-dimensional-1f92b4ed4149" target="_blank" rel="noreferrer noopener">Grammar of Graphics</a>, so there’s some further reading for you. I’m not going to go into it here; today’s all about the mechanics of using GGPlot2 with <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/" data-wpil-monitor-id="214">Google Analytics and Search Console</a> data so we can make some SEO-specific graphs.&nbsp;</p>



<p>First, let’s start by installing the packages we’re going to use today.</p>



<h2 class="wp-block-heading" id="installing-our-packages">Installing Our Packages&nbsp;</h2>



<p>Today, we’re going to use the following packages:&nbsp;</p>



<ul class="wp-block-list">
<li><strong>Tidyverse:</strong> As mentioned before, GGPlot2 is included in this, but it’s always worth having the other packages installed as well </li>



<li><strong>googleAnalyticsR: </strong>We ran through this in the last part, but this is the standard package for pulling Google Analytics data directly into R </li>



<li><strong>searchConsoleR:</strong> Again, we covered this last time, but this is how we get Google Search Console data into our R environment </li>



<li><strong>wordcloud:</strong> The standard package for creating wordclouds using R </li>



<li><strong>tm: </strong>My favourite package for text mining and cleaning  </li>
</ul>



<p>Now we know which packages we’re going to use for this exercise, let’s try and be efficient and install them all at once.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">instPacks &lt;- c("tidyverse", "googleAnalyticsR", "searchConsoleR", "googleAuthR", "tm", "wordcloud")

# Install anything that's missing, then load everything
newPacks &lt;- setdiff(instPacks, rownames(installed.packages()))
if (length(newPacks) &gt; 0) install.packages(newPacks)
lapply(instPacks, require, character.only = TRUE)</pre>



<p>Now we’ve got our packages installed, follow the authorisation steps from the previous piece and we can start getting some SEO data to visualise.</p>



<h2 class="wp-block-heading" id="our-datasets">Our Datasets&nbsp;</h2>



<p>We’re going to create four initial datasets for our first graphs:&nbsp;</p>



<ul class="wp-block-list">
<li><strong>Organic sessions over last 30 days:</strong> We’ll use this to build our very first line graph </li>



<li><strong>Organic sessions over last 30 days compared to previous period: </strong>Our second graph will have two lines, comparing the last 30 days to the previous 30 days </li>



<li><strong>Impressions over last 30 days:</strong> We’ll use this for our first bar chart </li>



<li><strong>Impressions &amp; clicks over last 30 days:</strong> This will build our first combo chart with dual axes </li>
</ul>



<p>Let’s get our Google Analytics datasets for our initial visualisations.</p>



<h3 class="wp-block-heading" id="google-analytics-datasets-for-our-visualisations">Google Analytics Datasets For Our Visualisations</h3>



<p>As we covered in the <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">last part</a>, let’s pull some basic Google Analytics data using the organic segment and breaking it down by date. Remember to go through the authorisation steps and to replicate the View ID/segment identification stages for your own data. I&#8217;ve replaced my IDs with X for the purposes of this piece.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Authorise googleAnalyticsR with your Google account
ga_auth()

# List the accounts you can access, then pick your view (replace X with your row number)
gaAccounts &lt;- ga_account_list()

viewID &lt;- gaAccounts$viewId[X]

# Organic segment (replace X with your own segment ID)
orgSegment &lt;- segment_ga4("orgSegment", segment_id = "gaid::-X")

# Organic sessions by date for the last 30 days
ga30Days &lt;- google_analytics(viewID, date_range = c("30DaysAgo", "yesterday"), metrics = c("sessions"), dimensions = "date", segments = orgSegment)</pre>



<p>This is the basic one that we’ll use for our very first visualisation, but we’ll also want to use a comparison range for our second, so let’s get that data as well.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GADataComparison &lt;- google_analytics(viewID, date_range = c("60DaysAgo", "31DaysAgo", "30DaysAgo", "yesterday"), metrics = c("sessions"), dimensions = "date", segments = orgSegment)</pre>



<p>Now we’ve got everything we need from Google Analytics, let’s get the datasets we need from Google Search Console to visualise.</p>



<h3 class="wp-block-heading" id="google-search-console-datasets">Google Search Console Datasets</h3>



<p>First, we want to get our data split out over the last 30 days. With the way searchConsoleR works, we can get it all in one call, which is pretty handy, considering the graphs we want to make from it. We just need the following commands, including the authorisation steps we talked about in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">the last piece</a>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">scr_auth()

scDataByDate &lt;- search_analytics("YOUR SITE", startDate = Sys.Date() - 30, endDate = Sys.Date() - 1, searchType = "web", dimensions = "date")</pre>



<p>This call will actually give us what we need for both the Google Search Console graphs we’re going to build, so this is nice and easy.&nbsp;Obviously replace &#8220;YOUR SITE&#8221; with your site, but don&#8217;t forget the speech marks.</p>
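


<p>If you don’t have access to a Google Analytics or Search Console account, you can still follow along with stand-in data. The sketch below builds made-up data frames with the same names as the ones above. The column names (date, sessions, clicks, impressions, ctr) are assumptions about what the two packages return, so check them with names() against your own data, and the numbers are randomly generated rather than real traffic.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up stand-in data for readers without API access (not real traffic numbers)
set.seed(42)

# The last 30 complete days, matching the date ranges used above
demoDates &lt;- seq(Sys.Date() - 30, Sys.Date() - 1, by = "day")

# Stand-in for the Google Analytics pull: one row per day
ga30Days &lt;- data.frame(date = demoDates, sessions = rpois(30, lambda = 120))

# Stand-in for the Search Console pull: one row per day
scDataByDate &lt;- data.frame(date = demoDates, clicks = rpois(30, lambda = 20), impressions = rpois(30, lambda = 1000))
scDataByDate$ctr &lt;- scDataByDate$clicks / scDataByDate$impressions

head(scDataByDate)</pre>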



<p>OK, now we’ve finally got our data, let’s start building some graphs.</p>



<h2 class="wp-block-heading" id="ggplot2-for-seo">GGPlot2 For SEO&nbsp;</h2>



<p>GGPlot2 is a very powerful and flexible graphing library, and you can make pretty much any data visualisation you can imagine with it, once you learn how it works. <a href="http://www.datavisualisation-r.com/" target="_blank" rel="noreferrer noopener">Data Visualisation With R</a> by Thomas Rahlf is a fantastic resource to expand on this, if data visualisation is your jam.</p>



<h3 class="wp-block-heading" id="a-basic-ggplot2-line-graph-with-google-analytics-data">A Basic GGPlot2 Line Graph With Google Analytics Data&nbsp;</h3>



<p>Firstly, we want to use our Google Analytics dataframe to give us a simple line graph of the last 30 days of organic search data. This will be our first GGPlot2 visualisation command in R, so I’ll break it down afterwards.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ggplot(data=ga30Days, aes(x=date, y=sessions)) +
  geom_line(colour="darkgreen", stat="identity") + xlab("Date") + ylab("Sessions")</pre>



<p>This gives us the following graph:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="615" height="455" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicLineChartGGPlot.png" alt="Basic Google Analytics line chart with ggplot2" class="wp-image-3082" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicLineChartGGPlot.png 615w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicLineChartGGPlot-300x222.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicLineChartGGPlot-150x111.png 150w" sizes="(max-width: 615px) 100vw, 615px" /></figure>



<p>We’ll talk about how to export GGPlot2 graphs from RStudio later, but for now, let’s break that command down so you can see how we put this graph together.</p>



<ul class="wp-block-list">
<li><strong>ggplot(data=ga30Days, aes(x=date, y=sessions))</strong>: We&#8217;re invoking the ggplot command, telling it what data we want to use (our ga30Days set in this case) and telling it which axes it should use</li>



<li><strong>+</strong>: In GGPlot2, this adds a new layer or component to the plot. Ending a line with it also lets us split the command across multiple lines so that we can get it all on one screen</li>



<li><strong>geom_line(colour=&quot;darkgreen&quot;</strong>: We&#8217;re using the line graph geom to draw this chart and we&#8217;re using dark green for the colour. GGPlot2 accepts both the British &#8220;colour&#8221; and the American &#8220;color&#8221; spelling</li>



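<li><strong>, stat=&quot;identity&quot;)</strong>: This tells GGPlot2 to plot the values exactly as they appear in the dataset, rather than calculating a summary such as a count. It&#8217;s already the default for geom_line, so it&#8217;s optional here</li>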



<li><strong>+ xlab(&quot;Date&quot;) + ylab(&quot;Sessions&quot;)</strong>: We&#8217;re labelling the X and Y axes according to our datasets. &#8220;Date&#8221; and &#8220;Sessions&#8221; in this case</li>
</ul>
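


<p>The breakdown above can also be seen by building the plot one piece at a time and storing each stage in a variable. This is only a sketch: demoSessions is a made-up data frame standing in for ga30Days, and the title and theme_minimal() are optional extras that aren’t part of the original command.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(ggplot2)

# Made-up stand-in for ga30Days (illustrative numbers only)
demoSessions &lt;- data.frame(
  date = seq(as.Date("2022-03-01"), by = "day", length.out = 30),
  sessions = round(100 + 20 * sin(seq_len(30) / 4))
)

# Stage 1: the data and the axis mappings, with nothing drawn yet
basePlot &lt;- ggplot(data = demoSessions, aes(x = date, y = sessions))

# Stage 2: add the line layer
linePlot &lt;- basePlot + geom_line(colour = "darkgreen")

# Stage 3: add labels and a theme (labs() sets both axis labels and the title in one go)
finalPlot &lt;- linePlot +
  labs(x = "Date", y = "Sessions", title = "Organic Sessions: Last 30 Days") +
  theme_minimal()

finalPlot</pre>



<p>Printing basePlot on its own gives an empty set of axes, which is a handy way to see that each + really is adding something to the chart.</p>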



<p>Now we’ve got a bit of an understanding of how to put a chart together in GGPlot2, let’s start expanding them with a secondary series.&nbsp;</p>



<h3 class="wp-block-heading" id="creating-comparison-line-graphs-with-google-analytics-data-in-r">Creating Comparison Line Graphs With Google Analytics Data In R</h3>



<p>Now we know how to create a single line graph in R with GGPlot2, we should think about adding comparison lines so our SEO reports can show changes. Here’s how we can do that.&nbsp;</p>



<p>First we need to do a little bit of preparation on our data so the two lines can be layered against the same date range. We&#8217;ll do that with subsetting and creating a new frame so the two date ranges are a clear comparison.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GADataComparisonLast30 &lt;- GADataComparison$sessions.d2[31:60]

GADataComparisonPrev30 &lt;- GADataComparison$sessions.d1[1:30]

GADataComparisonVis &lt;- data.frame(GADataComparison$date[31:60], GADataComparisonLast30, GADataComparisonPrev30)

colnames(GADataComparisonVis) &lt;- c("Date", "Sessions Last 30 Days", "Sessions Previous 30 Days")</pre>



<p>Now we can add a second line to our chart like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ggplot(data=GADataComparisonVis) +
  geom_line(aes(y=GADataComparisonLast30, x=Date), colour="darkgreen") +
  geom_line(aes(y=GADataComparisonPrev30, x=Date), colour="black")</pre>



<p>And that’ll give us the following chart:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="615" height="455" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comparisonLineChartGGPlot.png" alt="Comparison line chart using Google Analytics data with GGPlot2" class="wp-image-3088" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comparisonLineChartGGPlot.png 615w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comparisonLineChartGGPlot-300x222.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comparisonLineChartGGPlot-150x111.png 150w" sizes="(max-width: 615px) 100vw, 615px" /></figure>



<p>You can see that we’ve added a second geom_line layer to our original command, which draws the comparison line. I’ve kept the &#8220;darkgreen&#8221; colour for the last 30 days, because it matches my site’s theme, and used black for the previous period.</p>
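


<p>One thing the two-layer approach doesn’t give you is a legend. A common alternative, sketched here under assumptions, is to reshape the data into a long format and let GGPlot2 pick the colour from a column. The demoComparison data frame is made up, but it uses the same column names as GADataComparisonVis; pivot_longer comes from tidyr, which is part of the Tidyverse.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(ggplot2)
library(tidyr)

# Made-up stand-in for GADataComparisonVis (illustrative numbers only)
set.seed(1)
demoComparison &lt;- data.frame(
  Date = seq(as.Date("2022-03-26"), by = "day", length.out = 30),
  `Sessions Last 30 Days` = rpois(30, lambda = 120),
  `Sessions Previous 30 Days` = rpois(30, lambda = 100),
  check.names = FALSE
)

# One row per date per period, instead of one column per period
demoLong &lt;- pivot_longer(demoComparison, cols = -Date, names_to = "Period", values_to = "Sessions")

# Mapping colour to Period draws both lines and builds the legend automatically
legendPlot &lt;- ggplot(demoLong, aes(x = Date, y = Sessions, colour = Period)) +
  geom_line() +
  scale_colour_manual(values = c("Sessions Last 30 Days" = "darkgreen", "Sessions Previous 30 Days" = "black"))

legendPlot</pre>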



<p>And that’s a really simple introduction to line charts using Google Analytics data in GGPlot2.</p>



<h3 class="wp-block-heading" id="building-bar-charts-with-search-console-data-in-ggplot2">Building Bar Charts With Search Console Data In GGPlot2&nbsp;</h3>



<p>Now we’ve learned how to build basic line graphs, let’s use our Google Search Console dataset to build up a bar chart of impressions over the last 30 days with GGPlot2.</p>



<p>Here’s how to do that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ggplot(data=scDataByDate) +
 geom_col(aes(date, impressions), size = 1, colour = "black", fill = "darkgrey")</pre>



<p>This will give us the following graph (again, your numbers will be different and almost certainly higher):</p>



<figure class="wp-block-image size-full"><img decoding="async" width="615" height="455" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicBarChartGGPlot.png" alt="Basic bar chart with GGPlot2 using Google Search Console Impression data" class="wp-image-3091" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicBarChartGGPlot.png 615w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicBarChartGGPlot-300x222.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/basicBarChartGGPlot-150x111.png 150w" sizes="(max-width: 615px) 100vw, 615px" /></figure>



<p>As before, let’s break the command down:</p>



<ul class="wp-block-list">
<li><strong>ggplot(data=scDataByDate)</strong>: As before, we&#8217;re calling the ggplot command and saying what dataset we&#8217;re focusing on</li>



<li><strong>geom_col(aes(date, impressions)</strong>: In this case, we&#8217;re using the geom_col command to tell ggplot that we&#8217;re using a column graph, and the axes are date and impressions</li>



<li><strong>size = 1, colour = &quot;black&quot;, fill = &quot;darkgrey&quot;)</strong>: These are styling arguments. The size sets the thickness of the outline around each bar (you can play with this a lot when you have different size datasets), colour sets the outline colour to black and fill sets the fill colour to dark grey</li>
</ul>



<p>As you can see, there’s not much difference between geom_line and geom_col. The main change is that we’re adding a fill parameter for the columns.&nbsp;</p>
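


<p>You’ll often see geom_bar used for this kind of chart as well. As a rough sketch of how the two relate: geom_col takes the bar heights straight from the data, while geom_bar counts rows by default and needs stat = "identity" to do the same job. The demoImpressions data frame below is made up for illustration.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(ggplot2)

# Made-up Search Console style data (illustrative numbers only)
demoImpressions &lt;- data.frame(
  date = seq(as.Date("2022-03-26"), by = "day", length.out = 7),
  impressions = c(950, 1020, 880, 1100, 1210, 990, 1050)
)

# geom_col uses the values in the data as the bar heights
colPlot &lt;- ggplot(demoImpressions) +
  geom_col(aes(date, impressions), colour = "black", fill = "darkgrey")

# geom_bar needs stat = "identity" to draw the same chart
barPlot &lt;- ggplot(demoImpressions) +
  geom_bar(aes(date, impressions), stat = "identity", colour = "black", fill = "darkgrey")

colPlot</pre>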



<p>So that’s how we can build a basic bar chart in GGPlot2 using Google Search Console impression data. Now let’s build up a combo chart of clicks and impressions.</p>



<h3 class="wp-block-heading" id="building-a-combo-chart-of-google-search-console-impressions-amp;-clicks-in-ggplot2">Building A Combo Chart Of Google Search Console Impressions &amp; Clicks In GGPlot2</h3>



<p>We covered this a bit in my post about <a href="https://www.ben-johnston.co.uk/keyword-topic-clustering-seo-r/">Keyword Clustering for SEO</a>, but I thought it was worth including and expanding in today’s piece.</p>



<p>What we’re going to do here is create a bar chart of impressions as above, but also add a line chart of clicks on a secondary axis. A quick disclaimer is that us analysts generally hate combo charts like this because they make it very easy for the end user to misinterpret the data, but it’s something that people ask for a lot, so it’s worth covering for this specific use case.</p>



<p>Use the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ggplot(scDataByDate) +
  geom_col(aes(date, impressions), size = 1, colour = "black", fill = "darkgrey") +
  geom_line(aes(date, 50*clicks), size = 1, colour = "darkgreen", group =1) +
  scale_y_continuous(sec.axis = sec_axis(~./50, name = "clicks"))</pre>



<p>And this will give the following chart using my dataset:&nbsp;</p>



<figure class="wp-block-image size-full"><img decoding="async" width="615" height="455" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comboChartGGPlot.png" alt="Combo chart using GGPlot2 with Google Search Console clicks and impressions" class="wp-image-3097" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comboChartGGPlot.png 615w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comboChartGGPlot-300x222.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/comboChartGGPlot-150x111.png 150w" sizes="(max-width: 615px) 100vw, 615px" /></figure>



<p>Let&#8217;s break it down:</p>



<ul class="wp-block-list">
<li><strong>ggplot(scDataByDate)</strong>: As always, we&#8217;re calling the ggplot command and defining the dataset</li>



<li><strong>geom_col(aes(date, impressions), size = 1, colour = &quot;black&quot;, fill = &quot;darkgrey&quot;)</strong>: As before, we&#8217;re building our bar chart</li>



<li><strong>geom_line(aes(date, 50*clicks), size = 1, colour = &quot;darkgreen&quot;, group =1) + scale_y_continuous(sec.axis = sec_axis(~./50, name = &quot;clicks&quot;))</strong>: Here, we&#8217;re adding the clicks line and its secondary axis. Excel or other tools will usually scale this for you automatically, but with R, we have to do that manually: we multiply clicks by 50 so the line sits on the same scale as impressions, then sec_axis(~./50) divides by 50 again so the right-hand axis shows the real click numbers. I picked 50 because my clicks over this time period were low compared to my impressions. Assuming you spend more time on your own site, you can adapt the multiplier accordingly</li>
</ul>
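


<p>Rather than guessing a multiplier like 50, one option is to work it out from the data. The sketch below is one way to do that, assuming you want the highest point of the clicks line to match the tallest impressions bar. The demoSC data frame and the scaleFactor name are made up for illustration.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(ggplot2)

# Made-up Search Console style data (illustrative numbers only)
demoSC &lt;- data.frame(
  date = seq(as.Date("2022-03-26"), by = "day", length.out = 7),
  impressions = c(950, 1020, 880, 1100, 1210, 990, 1050),
  clicks = c(18, 22, 15, 25, 30, 19, 21)
)

# How many impressions one click is "worth" so that the two peaks line up
scaleFactor &lt;- max(demoSC$impressions) / max(demoSC$clicks)

comboPlot &lt;- ggplot(demoSC) +
  geom_col(aes(date, impressions), colour = "black", fill = "darkgrey") +
  # Scale clicks up onto the impressions axis...
  geom_line(aes(date, clicks * scaleFactor), colour = "darkgreen", group = 1) +
  # ...and scale the secondary axis labels back down to real click numbers
  scale_y_continuous(name = "impressions", sec.axis = sec_axis(~ . / scaleFactor, name = "clicks"))

comboPlot</pre>



<p>The multiplier and the divisor have to be the same number, otherwise the right-hand axis will be labelled with the wrong click values.</p>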



<p>There we have it: a quick guide to building combo charts in R using Google Search Console’s impressions and clicks data.</p>



<p>Now we’re going to do our final visualisation of this piece: a wordcloud of our search queries using Google Search Console data.</p>



<h2 class="wp-block-heading" id="creating-wordclouds-in-r-using-google-search-console-data">Creating Wordclouds In R Using Google Search Console Data</h2>



<p>We’re going to use the wordcloud R package for this part, rather than GGPlot2, but it’s still fairly simple. What we’re going to do here is create a wordcloud using our Google Search Console queries over the last 30 days.</p>



<p>Firstly, as always, we’ve got to prepare our data:</p>



<h3 class="wp-block-heading" id="google-search-console-query-data-in-r">Google Search Console Query Data In R</h3>



<p>Just as we used the date dimension in our earlier searchConsoleR call, this time we’re going to break the data down with the query dimension.&nbsp;</p>



<p>Here’s how to do that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">scDataByQuery &lt;- search_analytics("YOUR SITE", startDate = Sys.Date() - 30, endDate = Sys.Date() - 1, searchType = "web", dimensions = "query")</pre>



<p>As you can see, the only difference between this query and the one we used in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">part 2</a> is that we changed the dimension parameter to “query” instead of “date”, meaning it’s broken down by search term instead of by date.&nbsp;</p>
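


<p>Before turning the queries into a wordcloud, it can help to glance at the biggest ones. Here’s a minimal base R sketch that sorts by impressions and shows the top rows. The demoQueries data frame is made up, and its column names are assumptions about what the query breakdown returns, so swap in scDataByQuery once you’ve checked yours.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up stand-in for scDataByQuery (illustrative numbers only)
demoQueries &lt;- data.frame(
  query = c("r for seo", "ggplot2 bar chart", "search console r", "wordcloud in r"),
  clicks = c(12, 3, 7, 1),
  impressions = c(400, 150, 320, 90)
)

# Sort from most to least impressions
topQueries &lt;- demoQueries[order(demoQueries$impressions, decreasing = TRUE), ]

# Show the three biggest queries
head(topQueries, 3)</pre>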



<p>Now, similar to the <a href="https://www.ben-johnston.co.uk/keyword-topic-clustering-seo-r/">Keyword Clustering</a> post, we’ve got to clean up our dataset so it can be easily interpreted and remove potential duplicates, which means turning it into a corpus &#8211; a “bag of words&#8221;, essentially, and then making sure it’s tidied up.</p>



<h3 class="wp-block-heading" id="creating-amp;-cleaning-a-text-corpus-in-r">Creating &amp; Cleaning A Text Corpus In R</h3>



<p>You’re figuring out that there’s always some data-prep work to do before making your visualisations now, right?</p>



<p>Firstly, we’ve got to use the <a href="https://cran.r-project.org/web/packages/tm/tm.pdf" target="_blank" rel="noreferrer noopener">tm</a> (text mining) package to create a corpus from our Search Console queries. We already installed tm at the start of this piece, so now we just need to run the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queryCorpus &lt;- Corpus(VectorSource(scDataByQuery$query))</pre>



<p>Now if you print the corpus by typing “queryCorpus” into the console, you’ll get the following:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="300" height="24" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/corpusScreenshot-300x24.png" alt="R corpus screenshot" class="wp-image-3106" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/corpusScreenshot-300x24.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/corpusScreenshot-150x12.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/corpusScreenshot-768x63.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/corpusScreenshot.png 943w" sizes="(max-width: 300px) 100vw, 300px" /></figure>



<p>Not especially helpful at this point, right? And certainly not ready for a wordcloud.</p>
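


<p>If you want to see the text itself rather than the summary, you can pull the content back out of the corpus. This is a small sketch using made-up queries (toyQueries and toyCorpus are illustrative names, not part of the walkthrough):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(tm)

# Made-up queries standing in for scDataByQuery$query
toyQueries &lt;- c("R for SEO", "keyword clustering in R", "rvest tutorial")
toyCorpus &lt;- Corpus(VectorSource(toyQueries))

# Number of documents in the corpus (one per query)
length(toyCorpus)

# The text of the first two documents
content(toyCorpus)[1:2]</pre>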



<p>That’s why we have to clean it up.</p>



<h3 class="wp-block-heading" id="why-we-clean-text-corpuses">Why We Clean Text Corpuses</h3>



<p>Text analysis is one of the main things I do with R, and something I’ve done for a lot of my SEO clients over the years. There’s a lot to go into with it and I’m not going to run through it all in this series, but hopefully this very brief introduction will whet your appetite to explore it further.&nbsp;</p>



<p>If it does, the book <a href="https://www.tidytextmining.com/" target="_blank" rel="noreferrer noopener">Text Mining With R</a> by Julia Silge and David Robinson is a fantastic resource and well worth a read.&nbsp;</p>



<p>Essentially, the reason we clean a corpus is to avoid duplication and miscounting. In a raw corpus, the same word with and without a capital letter will be counted as two separate words, as will plurals and the many other variations a word can have. You’ll also find that “stopwords” such as “and”, “the” and so on will be counted, and they’ll probably outnumber the words we want to focus on. When we clean a corpus, we eliminate this duplication by doing the following:&nbsp;</p>



<ul class="wp-block-list">
<li><strong>Make everything lowercase:</strong> This ensures consistency of our words, eliminating duplicates caused by capitalisation</li>



<li><strong>Remove punctuation: </strong>Punctuation marks can cause further duplication or variations of words which we don’t want </li>



<li><strong>“Stem” our words:</strong> This removes extensions, thus removing duplicates caused by pluralisation. Some of the words may look a little strange, but it’s still easy to figure them out </li>



<li><strong>Remove “stopwords”:</strong> We want to keep our key terms here rather than including words like “the” and “and” and so on. These will skew our text numbers </li>
</ul>
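


<p>To make those four steps concrete before we reach for tm, here is a rough sketch in base R on three made-up queries. It is only an illustration: the stopword list is a tiny hand-picked one, and stripping a trailing “s” is a crude stand-in for proper stemming, not what tm actually does.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up queries; the real pipeline below uses the tm package instead
toyQueries &lt;- c("SEO Tools", "seo tool", "The best SEO tools!")

# 1. Make everything lowercase
step1 &lt;- tolower(toyQueries)

# 2. Remove punctuation
step2 &lt;- gsub("[[:punct:]]", "", step1)

# 3. Crude stand-in for stemming: strip a trailing "s" from each word
step3 &lt;- gsub("s\\b", "", step2, perl = TRUE)

# 4. Remove a tiny hand-picked stopword list, then tidy up leftover spaces
toyStopwords &lt;- c("the", "and", "a")
stopPattern &lt;- paste0("\\b(", paste(toyStopwords, collapse = "|"), ")\\b")
step4 &lt;- gsub(stopPattern, "", step3, perl = TRUE)
cleaned &lt;- trimws(gsub("\\s+", " ", step4))

cleaned</pre>



<p>The first two queries end up identical, which is exactly the kind of duplication we want the cleaning to collapse.</p>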



<p>OK, now we know what we’re going to do, let’s run it. We’re going to have a few commands to use here. I showed you how to combine it into a single R function in my <a href="https://www.ben-johnston.co.uk/keyword-topic-clustering-seo-r/">Keyword Clustering post</a>, but for now, let’s break it down:</p>



<h3 class="wp-block-heading" id="making-a-corpus-lowercase-in-r">Making A Corpus Lowercase In R</h3>



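<p>We’re going to take the Google Search Console query corpus that we created in the last step and apply the tolower function to it, using the tm_map command from the tm R package, like so. The tolower function itself is part of base R; tm_map is what applies it across the whole corpus. You’ll see that we’re overwriting our initial queryCorpus object to do this.</p>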



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queryCorpus &lt;- tm_map(queryCorpus, tolower)</pre>



<p>Simple as that. Now our entire Search Console query corpus is in lowercase.</p>
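


<p>One caveat, offered as a hedge rather than a required change: if your corpus was built with VCorpus() rather than Corpus(), the tm documentation wraps plain character functions in content_transformer() so that the result stays a proper corpus. A minimal sketch with made-up queries (toyCorpus is an illustrative name):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(tm)

# A VCorpus built from made-up queries
toyCorpus &lt;- VCorpus(VectorSource(c("R For SEO", "Keyword Clustering")))

# content_transformer() lets a plain character function work on tm documents
toyCorpus &lt;- tm_map(toyCorpus, content_transformer(tolower))

# Pull the text back out to check the result
lowered &lt;- unname(sapply(toyCorpus, as.character))
lowered</pre>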



<h3 class="wp-block-heading" id="removing-punctuation-in-an-r-corpus">Removing Punctuation In An R Corpus</h3>



<p>Removing punctuation is an important step in text analysis. Punctuation can cause unwanted variants, such as the hyphenated and unhyphenated spellings of the same term being counted separately, and in general, we don’t want it around when we’re analysing text. Fortunately, removing it takes just one command, thanks to the tm package:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queryCorpus &lt;- tm_map(queryCorpus, removePunctuation)</pre>



<p>And now we’ve gotten rid of any punctuation elements from our Search Console queries.</p>
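


<p>To see what this does to an individual query, you can call removePunctuation directly on a character vector. This is a small sketch with made-up queries; the preserve_intra_word_dashes argument is an optional extra that the walkthrough above does not use.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(tm)

# removePunctuation also works on a plain character vector
noPunct &lt;- removePunctuation(c("what is e-mail marketing?", "seo: tips, tools, tricks!"))
noPunct

# Optionally keep dashes that sit inside a word
keepDashes &lt;- removePunctuation("what is e-mail marketing?", preserve_intra_word_dashes = TRUE)
keepDashes</pre>



<p>Note that by default the hyphen disappears too, so “e-mail” becomes “email”. Whether that is what you want depends on how your audience types their queries.</p>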



<h3 class="wp-block-heading" id="removing-stopwords-in-an-r-corpus">Removing Stopwords In An R Corpus</h3>



<p>For the next stage of our data preparation, we need to remove the stopwords from our corpus, words like “if”, “and”, “it”, “the” and so on; words that will interfere with our analysis of our query data.</p>



<p>We can do that with the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queryCorpus &lt;- tm_map(queryCorpus, removeWords, stopwords("english"))</pre>



<p>We’re almost there with cleaning our corpus. Now we just need to take the words down to their stems to remove pluralisation duplication.</p>
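


<p>One thing worth knowing is that removing words leaves the surrounding spaces behind. The sketch below, using a made-up query, shows one way to tidy that up with tm’s stripWhitespace and base R’s trimws. This extra step is my suggestion rather than part of the walkthrough, and the wordcloud will work without it.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(tm)

toyQuery &lt;- "how to use r for seo"

# Stopwords are deleted, but the spaces around them stay behind
noStops &lt;- removeWords(toyQuery, stopwords("english"))

# stripWhitespace() squeezes runs of spaces down to one; trimws() trims the ends
tidyQuery &lt;- trimws(stripWhitespace(noStops))
tidyQuery</pre>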



<h3 class="wp-block-heading" id="stemming-words-in-an-r-corpus">Stemming Words In An R Corpus</h3>



<p>Finally, we need to remove all the possible duplications caused by pluralisation and the many different ways that people can pluralise search terms. Again, using the tm package, we can use the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queryCorpus &lt;- tm_map(queryCorpus, stemDocument)</pre>



<p>We’re still rewriting our original corpus, but now all the words will be in lowercase, punctuation and stopwords are gone and all the possible extensions have been removed through taking the words down to their stems.</p>
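


<p>As a quick illustration of what stemming does, you can run stemDocument on a few made-up words. As far as I’m aware, stemDocument relies on the SnowballC package for the actual stemming, so treat that as an assumption and install SnowballC if the command complains that it’s missing.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(tm)

# Three variations of the same word (made-up example)
toyWords &lt;- c("ranking", "rankings", "ranked")

# Each variation should be cut down to the same stem
toyStems &lt;- stemDocument(toyWords)
toyStems</pre>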



<p>If you’ve followed all these steps, you’ll have a clean corpus of your Google Search Console query data and we’re ready to move on to create our wordcloud.</p>



<h3 class="wp-block-heading" id="our-wordcloud-with-google-search-console-query-data-in-r">Our Wordcloud With Google Search Console Query Data In R&nbsp;</h3>



<p>I know it can feel like a lot of heavy lifting to get to this point, with all the data cleaning and preparation, but I promise it’ll get easier when we follow the steps in the next few sessions, particularly the piece when we discuss <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/">functions</a>. But now, we’re ready to create our wordcloud with our Google Search Console query data.&nbsp;</p>



<p>We can do that with the following command using the wordcloud R package, which we installed earlier:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">wordcloud(queryCorpus, scale=c(3,0.4), max.words=350, random.order=FALSE, 
          rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8,"Dark2"))</pre>



<p>Obviously your output will be different, since your site will have different queries, but you’ll get something similar to this:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="615" height="455" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/wordcloud.png" alt="Wordcloud in R using the Wordcloud package and Google Search Console data" class="wp-image-3120" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/wordcloud.png 615w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/wordcloud-300x222.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/wordcloud-150x111.png 150w" sizes="(max-width: 615px) 100vw, 615px" /></figure>



<p>Now we know how to create a wordcloud in R, let’s break down the command so you can edit it in the future:</p>



<ul class="wp-block-list">
<li><strong>wordcloud(queryCorpus</strong>: We&#8217;re calling the wordcloud command from our wordcloud package and using the queryCorpus dataset</li>



<li><strong>scale=c(3,0.4), max.words=350, random.order=FALSE</strong>: scale sets the range of word sizes, from the largest word to the smallest (you can adapt as required), max.words limits the cloud to the 350 most frequent words, and random.order=FALSE plots the words in decreasing order of frequency rather than in a random order</li>



<li><strong>rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8,&#8220;Dark2&#8221;))</strong>: rot.per=0.35 rotates roughly 35% of the words by 90 degrees, use.r.layout=FALSE uses the package&#8217;s compiled code rather than R to stop words overlapping, and colors uses RColorBrewer&#8217;s Dark2 colour scheme</li>
</ul>
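


<p>The wordcloud command can also take words and their counts directly, which is handy for experimenting with the parameters above. This is a minimal sketch with made-up words and frequencies (toyWords and toyFreq are illustrative names); set.seed is there so the layout comes out the same each time you run it.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(wordcloud)

# Made-up stems and counts standing in for a cleaned query corpus
toyWords &lt;- c("seo", "r", "keyword", "cluster", "scrape", "rvest")
toyFreq &lt;- c(40, 32, 21, 15, 9, 6)

# Fix the random seed so the layout is the same on every run
set.seed(123)

wordcloud(words = toyWords, freq = toyFreq, min.freq = 1, scale = c(3, 0.4),
          random.order = FALSE, rot.per = 0.35, colors = brewer.pal(8, "Dark2"))</pre>



<p>Try changing rot.per to 0 or swapping “Dark2” for another RColorBrewer palette to see how each parameter affects the output.</p>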



<p>Now we’ve had a bit of fun with SEO data visualisation in R, let’s get those graphics exported to use in reports or presentations.</p>



<h2 class="wp-block-heading" id="exporting-data-visualisations-in-rstudio">Exporting Data Visualisations In RStudio</h2>



<p>Saving your R plots to graphics files like JPG or PNG to use in your reports or presentations is really easy with RStudio. You just go to your plot window and click the “export” button, like so:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="300" height="121" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/exportimagerstudio-300x121.png" alt="Export plot RStudio" class="wp-image-3123" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/exportimagerstudio-300x121.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/exportimagerstudio-1024x414.png 1024w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/exportimagerstudio-150x61.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/exportimagerstudio-768x310.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/04/exportimagerstudio.png 1131w" sizes="(max-width: 300px) 100vw, 300px" /></figure>



<p>You’ll get a few options around the file type. I typically go for PNG so the filesize is small, but you can use whatever works for you.</p>



<p>Now you’ll have a download of your plot in your chosen file type in your R project directory.</p>
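


<p>If you’d rather export from code than click through the plot window, base R’s graphics devices can do the same job. The sketch below is one way to do it under assumptions: the file name is hypothetical and the plot is a made-up line chart. For GGPlot2 charts, the ggsave command is an alternative.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical file name; it is written to the current working directory
plotFile &lt;- "sessions-plot.png"

# Open a PNG device, draw the plot, then close the device to write the file
png(plotFile, width = 800, height = 500)
plot(1:10, (1:10)^2, type = "l", col = "darkgreen", xlab = "Day", ylab = "Sessions")
dev.off()

# Check that the file was written
file.exists(plotFile)</pre>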



<h2 class="wp-block-heading" id="wrapping-up">Wrapping Up</h2>



<p>And there we have it. A simple introduction to using GGPlot2 and wordcloud to visualise your SEO data using R. As always, if you’ve got any questions, hit me up on <a href="https://twitter.com/ben_johnston80" target="_blank" rel="noreferrer noopener">Twitter</a> and be sure to sign up to my mailing list to get notified of my next piece of content.&nbsp;</p>



<p>Next time, we’re going to start taking R to the next level and look at functions. I hope you’ll join me.</p>



<h3 class="wp-block-heading" id="our-code-from-today">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Load Packages

instPacks &lt;- c("tidyverse", "googleAnalyticsR", "searchConsoleR", "googleAuthR", "tm", "wordcloud")

lapply(instPacks, require, character.only = TRUE)

# Authorise Google Analytics

ga_auth()

gaAccounts &lt;- ga_account_list()

viewID &lt;- gaAccounts$viewId[7]

# Get Google Analytics Data

orgSegment &lt;- segment_ga4("orgSegment", segment_id = "gaid::-5")

ga30Days &lt;- google_analytics(viewID, date_range =c("30DaysAgo","yesterday"), metrics = "sessions",
                             dimensions = "date",segment= orgSegment)

GADataComparison &lt;- google_analytics(viewID, date_range =c("60DaysAgo", "31DaysAgo", "30DaysAgo","yesterday"), metrics = "sessions",
                                     dimensions = "date",segment= orgSegment)

# Get Search Console Data

scr_auth()

scDataByDate &lt;- search_analytics("YOUR SITE", startDate = Sys.Date() -30, Sys.Date() -1, searchType = "web", dimensions = "date")

# Basic Line Graph With 30 Days Data

ggplot(data=ga30Days, aes(x=date, y=sessions)) +
  geom_line(colour="darkgreen", stat="identity") + xlab("Date") + ylab("Sessions")

## Comparison Data

GADataComparisonLast30 &lt;- GADataComparison$sessions.d2[31:60]

GADataComparisonPrev30 &lt;- GADataComparison$sessions.d1[1:30]

GADataComparisonVis &lt;- data.frame(GADataComparison$date[31:60], GADataComparisonLast30, GADataComparisonPrev30)

colnames(GADataComparisonVis) &lt;- c("Date", "Sessions Last 30 Days", "Sessions Previous 30 Days")

## Comparison Line Chart

ggplot(data=GADataComparisonVis) +
  geom_line(aes(y=GADataComparisonLast30, x=Date), colour="darkgreen") +
  geom_line(aes(y=GADataComparisonPrev30, x=Date), colour="black")

# Bar Chart From SC Impressions

ggplot(data=scDataByDate) +
 geom_col(aes(date, impressions), size = 1, colour = "black", fill = "darkgrey")

## Combo Chart With SC Impressions &amp; Clicks

ggplot(scDataByDate)+
  geom_col(aes(date, impressions), size = 1, colour = "black", fill = "darkgrey")+
  geom_line(aes(date, 50*clicks), size = 1, colour = "darkgreen", group =1)+
  scale_y_continuous(sec.axis = sec_axis(~./50, name = "clicks"))

# Wordcloud With Search Console Queries

## Get Query Data

scDataByQuery &lt;- search_analytics("YOUR SITE", startDate = Sys.Date() -30, Sys.Date() -1, searchType = "web", dimensions = "query")

## Convert Queries To Corpus

queryCorpus &lt;- Corpus(VectorSource(scDataByQuery$query))

## Convert Corpus To Lower Case

queryCorpus &lt;- tm_map(queryCorpus, tolower)

## Remove Punctuation

queryCorpus &lt;- tm_map(queryCorpus, removePunctuation)

## Remove Stopwords

queryCorpus &lt;- tm_map(queryCorpus, removeWords, stopwords("english"))

## Stem Words

queryCorpus &lt;- tm_map(queryCorpus, stemDocument)

## Create Wordcloud

wordcloud(queryCorpus, scale=c(3,0.4), max.words=350, random.order=FALSE, 
          rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8,"Dark2"))</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 2: Packages, Google Analytics &amp; Search Console With R</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Mon, 14 Feb 2022 04:03:00 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">http://167.71.131.91/?p=2998</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">R For SEO Part 2: Packages, Google Analytics &#038; Search Console With R</a></p>
<p>Hello and welcome to part two of my series on using the R programming language for SEO. Hopefully you enjoyed part one...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-2-packages-google-analytics-search-console/">R For SEO Part 2: Packages, Google Analytics &#038; Search Console With R</a></p>
<p>Hello and welcome to part two of my series on using the <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R programming language for SEO.</a> Hopefully you enjoyed <a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">part one</a> and are raring to go with today’s piece. Today, we’re going to be looking at R packages and as an example, we’re going to be using packages to directly pull in data from Google Analytics and Google Search Console. That’s right, no more downloading CSV exports and importing them.</p>



<p>If you find this useful, please consider giving it a share on your social networks and if you’d like to be kept up to date with my latest posts on this site, sign up for my free mailing list. No spam, no sales pitches, just updated content whenever it’s ready.</p>





<div class="wp-block-advanced-gutenberg-blocks-summary"><p class="wp-block-advanced-gutenberg-blocks-summary__title">Contents</p><div class="wp-block-advanced-gutenberg-blocks-summary__fold"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewbox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-chevron-up"><polyline points="18 15 12 9 6 15"></polyline></svg></div><ol role="directory" class="wp-block-advanced-gutenberg-blocks-summary__list"><li><a href="https://www.ben-johnston.co.uk#r-packages">R Packages</a><ol><li><a href="https://www.ben-johnston.co.uk#the-tidyverse-package">The&nbsp;Tidyverse&nbsp;Package</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#installing-r-packages">Installing R Packages&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#installing-multiple-r-packages-at-once">Installing Multiple R Packages&nbsp;At&nbsp;Once</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#using-google-analytics-in-r">Using Google Analytics&nbsp;In&nbsp;R&nbsp;</a><ol><li><a href="https://www.ben-johnston.co.uk#authenticating-googleanalyticsr">Authenticating&nbsp;GoogleAnalyticsR&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#querying-google-analytics-data-in-r-with-googleanalyticsr">Querying&nbsp;Google Analytics Data&nbsp;In&nbsp;R With&nbsp;GoogleAnalyticsR&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#our-first-googleanalyticsr-query">Our First&nbsp;GoogleAnalyticsR&nbsp;Query</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#multiple-metrics-in-googleanalyticsr">Multiple Metrics&nbsp;In&nbsp;GoogleAnalyticsR&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#adding-segments-in-googleanalyticsr">Adding Segments&nbsp;In&nbsp;GoogleAnalyticsR&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#adding-dimensions-to-googleanalyticsr">Adding 
Dimensions&nbsp;To&nbsp;GoogleAnalyticsR</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#using-dynamic-date-ranges-in-googleanalyticsr">Using Dynamic Date Ranges&nbsp;In&nbsp;GoogleAnalyticsR</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#using-google-search-console-in-r">Using Google Search Console In R&nbsp;</a><ol><li><a href="https://www.ben-johnston.co.uk#getting-google-search-console-data-with-searchconsoler">Getting Google Search Console Data&nbsp;With&nbsp;SearchConsoleR&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#using-dimensions-in-searchconsoler">Using Dimensions In&nbsp;searchConsoleR&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#using-date-dimensions-in-searchconsoler">Using Date Dimensions In&nbsp;searchConsoleR&nbsp;</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#wrapping-up">Wrapping Up&nbsp;</a><ol><li><a href="https://www.ben-johnston.co.uk#our-code-from-today">Our Code From Today</a><ol></ol></li></ol></li></ol></div>



<p></p>



<h2 class="wp-block-heading" id="r-packages">R Packages</h2>



<p>I’ve mentioned this before in some of my other <a href="https://www.ben-johnston.co.uk/category/r/">R&nbsp;blog&nbsp;posts</a>,&nbsp;but packages are essentially plugins or extensions for the language. They’re a series of&nbsp;custom-built&nbsp;functions that are wrapped up and shared into one easily installable bundle and they’re often the quickest way to solve problems.&nbsp;</p>



<p>There are&nbsp;over 18,000&nbsp;packages available on <a href="https://cran.r-project.org/web/packages/available_packages_by_name.html" target="_blank" rel="noreferrer noopener">CRAN</a> (the official R repository) and thousands more on GitHub and they’re a fantastic way to share&nbsp;or reuse&nbsp;your code. Although I won’t be covering how to build one in this series, it may be something I go back to later. It’s actually easy and I recommend creating a personal R package if you find you’ve got a lot of <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/"  data-wpil-monitor-id="213">functions</a> that you re-use. I have one myself and it’s saved&nbsp;me&nbsp;so much time.&nbsp;</p>



<p>As I say, I won’t be covering building your own R package in this piece, but there are several excellent resources below that you may want to check out as you get further on in your R journey:</p>



<ul class="wp-block-list"><li><strong><a href="https://support.rstudio.com/hc/en-us/articles/200486488-Developing-Packages-with-the-RStudio-IDE" target="_blank" rel="noreferrer noopener">The RStudio Package Dev Guide</a></strong></li><li><strong><a href="https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/" target="_blank" rel="noreferrer noopener">Hilary Parker&#8217;s Guide To Building A Personal R Package</a></strong></li><li><strong><a href="https://r-pkgs.org/package-within.html" target="_blank" rel="noreferrer noopener">Hadley Wickham&#8217;s R Packages Book</a></strong></li></ul>



<p>OK, now&nbsp;we know what an R package is, let’s install one.</p>



<h3 class="wp-block-heading" id="the-tidyverse-package">The&nbsp;Tidyverse&nbsp;Package</h3>



<p>With over 18,000 official R packages available, it can be a bit tricky to know where to start. As you get further into your R journey and start facing specific problems, you’ll become very familiar with&nbsp;Google and Stack Overflow. Nine times out of ten, you’ll find the answer there.&nbsp;</p>



<p>But for today, let’s install one package (or more accurately, a group of packages) that I&nbsp;almost never start an R project without:&nbsp;the&nbsp;incredible&nbsp;<a href="https://hadley.nz/" target="_blank" rel="noreferrer noopener">Hadley Wickham’s</a>&nbsp;<a href="https://www.tidyverse.org/" target="_blank" rel="noreferrer noopener">Tidyverse&nbsp;</a>package.&nbsp;</p>



<p>I’ve long been an admirer of Mr Wickham’s work, particularly his dedication to making the R universe a better, tidier place, and the Tidyverse package is essential. It incorporates the following packages, which we’ll become very familiar with as we go through this series:</p>



<ul class="wp-block-list"><li><strong>ggplot2:</strong> The essential graphics and visualisation package for R</li><li><strong>dplyr: </strong>One of the most common and easiest to use data manipulation packages. We’ll be using functions from this one a <em>lot</em></li><li><strong>tidyr: </strong>A fantastic series of functions for tidying up data sets. The more you do with R, the more familiar you’ll become with having to tidy up data, so this will become something you’ll use quite regularly</li><li><strong>readr: </strong>A package for reading rectangular data like CSVs, TSVs and other similar files</li><li><strong>purrr: </strong>A series of functions to help make working with functions and vectors easier</li><li><strong>tibble:</strong> Possibly the most annoying package ever created for people who don’t write clean code! Tibble is a new way of working with dataframes which, in Wickham’s own words, “does less and complains more”, forcing you to write cleaner, more expressive code</li><li><strong>stringr: </strong>Another essential package for SEOs, since we work with text data (strings) quite a bit and need quick, easy functions to deal with that</li><li><strong>forcats:</strong> This package offers a series of functions to make dealing with factors better or, frankly, tolerable</li></ul>



<p>As you can see, there’s a lot in the&nbsp;Tidyverse&nbsp;that we’ll be using quite a bit as we go on&nbsp;through this series.&nbsp;So&nbsp;let’s&nbsp;get this installed for our first package installation.</p>



<h3 class="wp-block-heading" id="installing-r-packages">Installing R Packages&nbsp;</h3>



<p>Installing R packages is really simple once you know the name of the one you want to install.&nbsp;</p>



<p>As mentioned in the previous post on the basics of R, it’s always best practice to put your package installations into&nbsp;your&nbsp;.R&nbsp;file at the top of&nbsp;RStudio&nbsp;just so you can run the whole code back without trying to remember what packages you used.&nbsp;There are other options like Packrat, but&nbsp;since I work out of my Dropbox&nbsp;most&nbsp;of the time, this causes more problems than it solves for me, so I tend to just keep them at the top of&nbsp;my .R&nbsp;file.&nbsp;</p>



<p>In your&nbsp;text area, type the following:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">install.packages("tidyverse")</pre>



<p>Remember that R is case sensitive,&nbsp;so you’ve got to be accurate or else you’ll get an error.&nbsp;</p>



<p>Now add the following:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">library(tidyverse)</pre>



<p>Now paste these two commands into your&nbsp;RStudio&nbsp;console. You’ll see the following:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="300" height="69" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/Screenshot_2-300x69.png" alt="Installing Tidyverse Package In R" class="wp-image-3003" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/Screenshot_2-300x69.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/Screenshot_2-1024x234.png 1024w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/Screenshot_2-150x34.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/Screenshot_2-768x175.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/Screenshot_2.png 1388w" sizes="(max-width: 300px) 100vw, 300px" /></figure>



<p>That’s it. Your&nbsp;Tidyverse&nbsp;package is installed and activated.&nbsp;The “install.packages” command downloads the package to your R environment and the “library” command activates it.&nbsp;&nbsp;</p>



<p>Unless you update your R installation, the installed package will stay installed and you’ll only need the “library” command to&nbsp;initialise&nbsp;it, but I recommend reinstalling regularly in case there have been updates.&nbsp;</p>



<h3 class="wp-block-heading" id="installing-multiple-r-packages-at-once">Installing Multiple R Packages&nbsp;At&nbsp;Once</h3>



<p>It’s nice to install our R packages as and when we need them, but if we know what packages we’ll be using at the start of our project, wouldn’t it be nice to cut down the lines we need to write? Here’s how you can install several R packages at once rather than writing numerous separate lines of code.</p>



<p>Firstly, we need to create a character vector containing the names of all the packages that we want to install. We’ll talk through the c( command in more detail shortly, but for now, just know it means “combine”, and we’re going to use it to combine the names of the packages we want for this piece into a single object.</p>



<p>Rather than typing install.packages("package") and library(package) for every single package we want to use, like we did for the Tidyverse, if we know the packages we’re going to work with for our project, we can create a vector like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">instPacks &lt;- c("tidyverse", "googleAnalyticsR", "searchConsoleR", "googleAuthR")</pre>



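<p>Once they’re installed (install.packages(instPacks) will install everything in the vector in one go), we can initialise them all by passing the vector to the require command with lapply, like so:</p>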



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lapply(instPacks, require, character.only = TRUE)</pre>



<p>If you know what packages you’re going to use at the start of your R project, or if you’re going to be running the same analysis for multiple clients, this is a great way to save time and make your code more efficient. It doesn’t work with every package, especially if you’re using them for the first time and there are&nbsp;authorisation&nbsp;requirements, or they’re coming from GitHub, but it’s always worth a try.</p>
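


<p>Since the require command only loads packages that are already installed, it can help to install anything that’s missing first. Here’s a minimal sketch of that idea using base R only. The missingPackages and loadOrInstall helper names are made up for illustration rather than taken from any package, and it assumes everything you need is available on CRAN:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical helper: returns the names in pkgs that aren't installed yet
missingPackages &lt;- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: installs whatever is missing, then loads everything
loadOrInstall &lt;- function(pkgs) {
  toInstall &lt;- missingPackages(pkgs)
  if (length(toInstall) &gt; 0) {
    install.packages(toInstall)
  }
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Usage with the vector from above (uncomment to run):
# loadOrInstall(instPacks)</pre>



<p>Using library rather than require inside the helper is a design choice: library stops with an error if a package can’t be loaded, whereas require just returns FALSE and carries on.</p>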



<p>Now we’ve got our packages installed, let’s start getting some Google Analytics data directly into R using the&nbsp;<a href="https://cran.r-project.org/web/packages/googleAnalyticsR/index.html" target="_blank" rel="noreferrer noopener">googleAnalyticsR</a>&nbsp;package.</p>



<h2 class="wp-block-heading" id="using-google-analytics-in-r">Using Google Analytics&nbsp;In&nbsp;R&nbsp;</h2>



<p>Before we can pull any data, we have to authenticate the googleAnalyticsR package with our Google Analytics account.</p>



<p>Here’s how to do that:</p>



<h3 class="wp-block-heading" id="authenticating-googleanalyticsr">Authenticating&nbsp;GoogleAnalyticsR&nbsp;</h3>



<p>First, type:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ga_auth()</pre>



<p>This will open a browser window with the following dialogue:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="230" height="300" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-230x300.png" alt="Authorise GoogleAnalyticsR" class="wp-image-3012" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-230x300.png 230w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-784x1024.png 784w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-115x150.png 115w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-768x1004.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth.png 893w" sizes="(max-width: 230px) 100vw, 230px" /></figure>



<p>I’ve had some issues with this when the R version isn’t up to date, so make sure you’re always using the latest version of the language.&nbsp;</p>



<p>Make the following selections:&nbsp;</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="174" height="300" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-2-174x300.png" alt="GoogleAnalyticsR Auth Selections" class="wp-image-3013" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-2-174x300.png 174w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-2-592x1024.png 592w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-2-87x150.png 87w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-2-768x1327.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-2-889x1536.png 889w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/ga-auth-2.png 902w" sizes="(max-width: 174px) 100vw, 174px" /></figure>



<p>It will give you an authorisation code. Paste that into your R console.</p>



<p>There you go, your R session is linked up with Google Analytics. Let’s get some data.</p>



<h3 class="wp-block-heading" id="querying-google-analytics-data-in-r-with-googleanalyticsr">Querying&nbsp;Google Analytics Data&nbsp;In&nbsp;R With&nbsp;GoogleAnalyticsR&nbsp;</h3>



<p>First, we need to create a dataset with our list of Google Analytics accounts and views. You can do that with the following command:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gaAccounts &lt;- ga_account_list()</pre>



<p>Now if you type</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">gaAccounts</pre>



<p>in your console, you’ll get the full list of Google Analytics accounts and views that are associated with the verified login you used. It’ll look something like this:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="300" height="13" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/gaAccounts-300x13.png" alt="GoogleAnalyticsR Account List Headers" class="wp-image-3015" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/gaAccounts-300x13.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/gaAccounts-1024x43.png 1024w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/gaAccounts-150x6.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/gaAccounts-768x32.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/gaAccounts.png 1309w" sizes="(max-width: 300px) 100vw, 300px" /></figure>



<p>If you have multiple views and accounts in your Google Analytics account the way I do, you’re going to want to create an object with the view you want to work with. By looking at the table above (your own Google Analytics data will differ, obviously), we want to use the skills we learned in the previous post to explore our data and create an object in our R environment with it.&nbsp;</p>



<p>When you look through the output above, you’ll be able to identify the row number of the Google Analytics view you want to focus on. Once you know the row number, type the following, putting your row number in the square brackets:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Replace 1 with the row number of the view you want to use
viewID &lt;- gaAccounts$viewId[1]</pre>



<p>Here, we’re creating a new R object called viewID, which holds the ID from the viewId column for the specific row that we want to focus on. When you look at the gaAccounts dataset, you should be able to identify the row number from the leftmost column. Put that number in the square brackets and your viewID object will be created, meaning you can use that instead of typing the specific ID into every API call.</p>
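


<p>If counting rows feels error-prone, a view can also be picked out by its name. This is a rough sketch rather than the only way to do it: it assumes the account list has a viewName column alongside viewId, and it uses a small made-up dataframe (gaAccountsExample) as a stand-in so it runs without a Google Analytics login:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up stand-in for the dataframe returned by ga_account_list()
gaAccountsExample &lt;- data.frame(
  viewName = c("All Web Site Data", "Blog Only"),
  viewId = c("11111111", "22222222"),
  stringsAsFactors = FALSE
)

# Keep the viewId from the row whose viewName matches the view we want
viewIDExample &lt;- gaAccountsExample$viewId[gaAccountsExample$viewName == "Blog Only"]
viewIDExample</pre>



<p>With your real data, you’d swap gaAccountsExample for gaAccounts and "Blog Only" for the name of your own view.</p>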



<p>Now we have that sorted, let’s make our first Google Analytics query in R.</p>



<h3 class="wp-block-heading" id="our-first-googleanalyticsr-query">Our First&nbsp;GoogleAnalyticsR&nbsp;Query</h3>



<p>For a test, let’s just get sessions over the last seven days.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">testData &lt;- google_analytics(viewID, date_range = c("2022-02-01","2022-02-08"), metrics = "sessions")</pre>



<p>At the time of writing this piece, that was the seven-day date range. Your mileage may well vary, but you can hopefully see how this works. Let’s break it down:&nbsp;</p>



<ul class="wp-block-list"><li><strong>testData &lt;-:</strong> This is the name of our dataset. Since it’s just a test to make sure our connection is working properly, we’re just calling it “testData” for now</li><li><strong>google_analytics(:</strong> We’re telling R that we want to use the google_analytics function from the googleAnalyticsR package to download data from Google Analytics</li><li><strong>viewID:</strong> We’re invoking the viewID object that we created in the previous section, telling googleAnalyticsR which Google Analytics view we want to query</li><li><strong>date_range =c(:</strong> The date range we want to query. The c( command means “combine”, something we’ll be using a lot more as we go. In this case, we’re going to combine our start and end dates</li><li><strong>metrics=:</strong> The metrics from Google Analytics that we want to get our data from. In the case of our first query, we just want to get “sessions”, but in the next section, we’ll be gathering more</li></ul>
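


<p>To make the c( command a little more concrete, here’s a small standalone sketch. The object names (dateRange and metricList) are just examples, and nothing here touches the Google Analytics API:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># c() combines individual values into a single vector
dateRange &lt;- c("2022-02-01", "2022-02-08")
metricList &lt;- c("sessions", "users", "pageviews")

# Each value can be pulled back out by its position
dateRange[1]         # the start date
length(metricList)   # how many metrics we combined</pre>



<p>Vectors like these can be passed straight to the date_range and metrics parameters instead of typing the c( command inside the query.</p>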



<p>If everything’s gone to plan, typing the following&nbsp;into your console:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">testData</pre>



<p>Will bring up the following:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="179" height="62" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/testdata-1.png" alt="Sessions Data From Google Analytics In R" class="wp-image-3024" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/testdata-1.png 179w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/testdata-1-150x52.png 150w" sizes="(max-width: 179px) 100vw, 179px" /></figure>



<p>Great, so we know it works. Now let’s get some more data and make it SEO-specific.</p>



<h3 class="wp-block-heading" id="multiple-metrics-in-googleanalyticsr">Multiple Metrics&nbsp;In&nbsp;GoogleAnalyticsR&nbsp;</h3>



<p>First,&nbsp;we want to start building our query to include multiple metrics to look at our website’s performance.&nbsp;</p>



<p>Let’s get Sessions, Users,&nbsp;Pageviews&nbsp;and Bounce Rate.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GAData &lt;- google_analytics(viewID, date_range =c("2022-02-01","2022-02-08"), metrics = c("sessions", "users", "pageviews","bouncerate"))</pre>



<p>There’s that c( command again. In this case, we’re using it to combine all these metrics into a single vector, so that Google Analytics returns them together in one dataframe.</p>



<p>If we view the result by typing GAData into the console, we’ll see the following:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="300" height="33" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/GAData-300x33.png" alt="Headers From Multiple Metrics In GoogleAnalyticsR" class="wp-image-3022" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/GAData-300x33.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/GAData-150x16.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/GAData.png 604w" sizes="(max-width: 300px) 100vw, 300px" /></figure>



<p>Nice to see, but not particularly useful. Let’s start breaking it down a bit with dimensions and segments.&nbsp;</p>



<h3 class="wp-block-heading" id="adding-segments-in-googleanalyticsr">Adding Segments&nbsp;In&nbsp;GoogleAnalyticsR&nbsp;</h3>



<p>Since this whole series is about using <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a>, it makes sense that we’d start pulling our Google Analytics data using the organic search segment. Here’s how we can do that.&nbsp;</p>



<p>There are a couple of steps to this. First, since every Google Analytics account&nbsp;may well be different,&nbsp;we need to&nbsp;find the ID of our organic segment. On top of that, going forwards, you may want to get the IDs of other segments you have, so it makes sense to create a&nbsp;dataframe&nbsp;of them all. You can do that like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GASegments &lt;- ga_segment_list()</pre>



<p>Now our GASegments dataset has all the IDs of the segments in our Google Analytics account. Look through that dataset by typing GASegments in the console to find your ID, or search for it using the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">which(GASegments$name == "Organic Traffic")</pre>
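


<p>One thing to watch: the which command returns the row number of the match rather than the segment ID itself. Here’s a minimal sketch of getting from one to the other, using a made-up stand-in dataframe (GASegmentsExample) and assuming the IDs sit in a column called segmentId. Check the column names in your own segments dataset, as they may differ:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up stand-in for the dataframe returned by ga_segment_list()
GASegmentsExample &lt;- data.frame(
  name = c("All Users", "Organic Traffic"),
  segmentId = c("gaid::-1", "gaid::-5"),
  stringsAsFactors = FALSE
)

# which() gives the row number of the matching segment
orgRow &lt;- which(GASegmentsExample$name == "Organic Traffic")

# Use that row number to pull out the ID itself
orgSegmentId &lt;- GASegmentsExample$segmentId[orgRow]
orgSegmentId</pre>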



<p>From here, whichever route we go, we can find the ID of our organic search segment. In my case, it’s gaid::-5 (the which command gives you the row number, and the ID is listed in that row). Now we need to define that segment for googleAnalyticsR. You can do that like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">orgSegment &lt;- segment_ga4("orgSegment", segment_id = "gaid::-5")</pre>



<p>This has been a bit of a mission for the first run, but now we’ve got our organic&nbsp;search&nbsp;segment defined&nbsp;in R, we can work that into our original query like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GADataOrg &lt;- google_analytics(viewID, date_range =c("2022-02-01","2022-02-08"), metrics = c("sessions", "users", "pageviews","bouncerate"), segment= orgSegment)</pre>



<p>Take a look at it by typing GADataOrg into the console.</p>



<p>And there we go. We’ve got the same&nbsp;dataframe&nbsp;as before, but using&nbsp;just&nbsp;organic search data.&nbsp;You’ll notice the “Segment” header added to the&nbsp;frame.&nbsp;</p>



<p>Now let’s break it down by the date dimension&nbsp;so we can see some performance trends.</p>



<h3 class="wp-block-heading" id="adding-dimensions-to-googleanalyticsr">Adding Dimensions&nbsp;To&nbsp;GoogleAnalyticsR</h3>



<p>To break your Google Analytics data down by dimension in R, we just need to add the dimensions parameter to our Google Analytics query, like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GADataOrgDates &lt;- google_analytics(viewID, date_range =c("2022-02-01","2022-02-08"), metrics = c("sessions", "users", "pageviews","bouncerate"), dimensions = "date", segment= orgSegment)</pre>



<p>Now you can see your performance over that date range&nbsp;using the organic segment, but we don’t want to figure out the date ranges manually every time, do we? Let’s see how to&nbsp;make the date ranges dynamic for every time we call it.</p>



<p>This will be important for later pieces in this series, and we’ll be covering other dimensions as well, so if you’re not familiar, it’s worth taking a look at the Google Analytics dimensions list.</p>



<h3 class="wp-block-heading" id="using-dynamic-date-ranges-in-googleanalyticsr">Using Dynamic Date Ranges&nbsp;In&nbsp;GoogleAnalyticsR</h3>



<p>This is actually&nbsp;quite&nbsp;easy, but&nbsp;wasn’t immediately obvious&nbsp;to me&nbsp;from the documentation&nbsp;when I first started using&nbsp;googleAnalyticsR, so hopefully this helps.&nbsp;</p>



<p>First, we need to edit the date_range part of our query. We still use the c( command, but instead of fixed dates, we say how many days we want to cover using the API shortcuts, like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">GADataDynamicDates &lt;- google_analytics(viewID, date_range = c("7daysAgo","yesterday"), metrics = c("sessions", "users", "pageviews","bouncerate"), dimensions = "date", segment = orgSegment)</pre>



<p>As you can see, there are options here. There’s a lot of flexibility available if you use the API’s relative date shortcuts, such as today, yesterday and NdaysAgo (where N is a number of days).</p>
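


<p>If you want to change the length of the window regularly, the shortcut text can be built rather than typed. A minimal sketch, assuming the API’s NdaysAgo shortcut format; daysAgo is a made-up helper name, not a function from googleAnalyticsR:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical helper: builds an "NdaysAgo" shortcut for a given number of days
daysAgo &lt;- function(n) {
  paste0(n, "daysAgo")
}

# A rolling 30-day range ending yesterday
rollingRange &lt;- c(daysAgo(30), "yesterday")
rollingRange</pre>



<p>The resulting vector can be passed to the date_range parameter in place of the c( command in the query.</p>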



<p>And there we have it, a rolling date&nbsp;dimension&nbsp;in&nbsp;googleAnalyticsR, using the organic search segment.&nbsp;Again, this is something that we’re going to be using a lot more&nbsp;as we go through this series, so it’s worth&nbsp;getting familiar with it now. And be sure to save this dataset, because we’re going to be using this as a basis for the next few posts.</p>



<p>Now let’s take what we’ve learned about using R packages and Google APIs&nbsp;to the Google Search Console&nbsp;R&nbsp;package, searchConsoleR.</p>



<h2 class="wp-block-heading" id="using-google-search-console-in-r">Using Google Search Console In R&nbsp;</h2>



<p>The&nbsp;<a href="https://cran.r-project.org/web/packages/searchConsoleR/index.html" target="_blank" rel="noreferrer noopener">searchConsoleR&nbsp;package</a>&nbsp;has a lot of great features and it’s pretty easy to use as well, so that’s what we’ll use&nbsp;to work with Google Search Console in R.&nbsp;</p>



<p>As with the&nbsp;googleAnalyticsR&nbsp;package, we need to&nbsp;authorise&nbsp;the R environment with our account. We can do that with the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">scr_auth()</pre>



<p>As with googleAnalyticsR, it&#8217;ll give you an authorisation option in your console window and then open a new browser window. I don&#8217;t generally install the httpuv package as I change accounts so often with this, but your mileage may vary.</p>



<p>Allow access to your chosen Google account and&nbsp;you’ll see an&nbsp;authorisation code. Paste this in and you&#8217;ll see a message in your console saying you’re&nbsp;authorised.&nbsp;</p>



<p>OK, now we’re linked up, let’s get some data.</p>



<h3 class="wp-block-heading" id="getting-google-search-console-data-with-searchconsoler">Getting Google Search Console Data&nbsp;With&nbsp;SearchConsoleR&nbsp;</h3>



<p>The&nbsp;searchConsoleR&nbsp;package gives you a wide range of functions, including rewriting your data, deleting sites and much more, which are a bit beyond the scope of this series, since we’re just looking at how we can use <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a>-specific data analysis, but there’s lots covered in the documentation.&nbsp;</p>



<p>As an example, let’s get our Google Search Console&nbsp;data for the last seven days. We can do that like so:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">scData &lt;- search_analytics("https://www.ben-johnston.co.uk", startDate = Sys.Date() -7, Sys.Date() -1, searchType = "web")</pre>



<p>You’ll get a warning that Search Console data isn’t accurate for the last three days, but it’ll still give you something to work with.&nbsp;If&nbsp;you’ve been working in SEO for more than fifteen minutes, you’ll know&nbsp;that Search Console&nbsp;data&nbsp;is&nbsp;not what you’d describe as reliable&nbsp;at the best of times, so this isn’t a huge shock.&nbsp;That aside, let’s break the query down:&nbsp;</p>



<ul class="wp-block-list"><li><strong>scData &lt;-: </strong>What we’re naming our dataset</li><li><strong>search_analytics(: </strong>We’re calling the search analytics part of the API, allowing us to get Google Search Console performance data</li><li><strong>“https://www.ben-johnston.co.uk”,: </strong>The name of the website we want to work with. You’ll need to put the full URL in there. I’ve used this site for this example</li><li><strong>startDate = Sys.Date() -7,:</strong> The Search Console API doesn’t have the same shortcuts as the Google Analytics API, so we need to be a little bit sneaky to get our rolling date ranges. Here, we’re using the Sys.Date() command to get the computer’s date and -7 to tell it to start seven days ago</li><li><strong>Sys.Date() -1,: </strong>As above, we’re telling the API call that our date range ends yesterday (the computer’s date minus one day). We haven’t named this argument, so R matches it by position to endDate, which comes straight after startDate in the function’s arguments</li><li><strong>searchType = “web”): </strong>We’re ending our API call by telling Search Console which type of search we want to look at. Obviously, Search Console breaks the different searches down by search type, so there are options here. I’m just using “web” for this example</li></ul>
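


<p>If you find yourself reusing rolling date ranges like this, one option is to wrap the date arithmetic in a small helper. This is only a sketch: gsc_date_range and its arguments are names I’ve made up for illustration, not part of searchConsoleR.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">## Hypothetical helper: build a rolling date range ending `lag` days ago
gsc_date_range &lt;- function(days = 7, lag = 1, today = Sys.Date()) {
  # Subtracting a number from a Date moves it back by that many days
  list(startDate = today - days, endDate = today - lag)
}

## A fixed "today" makes the arithmetic easy to check
range7 &lt;- gsc_date_range(today = as.Date("2022-02-09"))
format(range7$startDate)  # "2022-02-02"
format(range7$endDate)    # "2022-02-08"

## Used in a live call (not run here), with endDate named explicitly:
## dates &lt;- gsc_date_range()
## scData &lt;- search_analytics("https://www.ben-johnston.co.uk", startDate = dates$startDate, endDate = dates$endDate, searchType = "web")</pre>



<p>Naming endDate explicitly does the same thing as the positional version, it just makes the call a little easier to read when you come back to it later.</p>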



<p>If this runs correctly, you’ll get a&nbsp;dataframe&nbsp;like so:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="300" height="28" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scData-300x28.png" alt="Google Search Console Data in R" class="wp-image-3037" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scData-300x28.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scData-150x14.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scData-630x61.png 630w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scData.png 651w" sizes="(max-width: 300px) 100vw, 300px" /></figure>



<p>Handy, right? Much quicker than downloading it manually. But with no dimensions, all we get is a single row of totals for the period, which isn’t that helpful if we want to look at performance trends. Let’s look at how we can add dimensions to it.</p>



<h3 class="wp-block-heading" id="using-dimensions-in-searchconsoler">Using Dimensions In&nbsp;searchConsoleR&nbsp;</h3>



<p>As with the Google Analytics API, we can use dimensions to break our Google Search Console queries down.</p>



<p>As we all know, Google Search Console doesn’t have quite as much flexibility as Google Analytics, but hopefully by using dimensions, you’ll get&nbsp;some ideas of how you can use Google Search Console with R.</p>



<p>The dimensions available in the searchConsoleR package are:</p>



<ul class="wp-block-list"><li><strong>date</strong></li><li><strong>country</strong></li><li><strong>device</strong></li><li><strong>page</strong></li><li><strong>query</strong></li><li><strong>searchAppearance</strong></li></ul>
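


<p>Dimension names are easy to mistype, so here’s a rough sketch of how you might check them against the list above before making a call. The gscDimensions vector and check_dimensions helper are hypothetical names for illustration, not functions from the package.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">## The dimensions listed above
gscDimensions &lt;- c("date", "country", "device", "page", "query", "searchAppearance")

## Hypothetical helper: stop early if a dimension name isn't in the list
check_dimensions &lt;- function(dims) {
  bad &lt;- setdiff(dims, gscDimensions)
  if (length(bad) &gt; 0) {
    stop("Unknown dimension(s): ", paste(bad, collapse = ", "))
  }
  dims
}

## More than one dimension can be passed as a vector
myDims &lt;- check_dimensions(c("date", "device"))

## Used in a live call (not run here):
## scDataByDateDevice &lt;- search_analytics("https://www.ben-johnston.co.uk", startDate = Sys.Date() -7, endDate = Sys.Date() -1, searchType = "web", dimensions = myDims)</pre>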



<p>Now we have our list, let’s <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/"   title="apply" data-wpil-keyword-link="linked"  data-wpil-monitor-id="227">apply</a> a date dimension to our data.</p>



<h3 class="wp-block-heading" id="using-date-dimensions-in-searchconsoler">Using Date Dimensions In&nbsp;searchConsoleR&nbsp;</h3>



<p>As with our googleAnalyticsR queries, if we want to add a dimension to our Google Search Console R query, we need to add the “dimensions” parameter to our command. We can do that like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">scDataByDate &lt;- search_analytics("https://www.ben-johnston.co.uk", startDate = Sys.Date() -7, Sys.Date() -1, searchType = "web", dimensions = "date")</pre>



<p>You’ll notice that nothing’s changed from our original call in the previous section apart from adding the dimensions = “date” parameter. From running this, we’ll get the same dataframe as above, but broken down by date like so:</p>



<figure class="wp-block-image size-medium"><img decoding="async" width="300" height="24" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scDataByDate-300x24.png" alt="Google Search Console Data Split By Date in R" class="wp-image-3042" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scDataByDate-300x24.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scDataByDate-150x12.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scDataByDate-768x62.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/02/scDataByDate.png 834w" sizes="(max-width: 300px) 100vw, 300px" /></figure>
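


<p>Once the data is split by date, you can start doing some quick calculations on it. The sketch below uses made-up numbers in a dataframe with date, clicks, impressions and ctr columns, which is the shape I’m assuming scDataByDate comes back in, so swap in your own dataframe to try it.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">## Made-up data in the shape we're assuming scDataByDate comes back in
exampleByDate &lt;- data.frame(
  date = as.Date("2022-02-01") + 0:2,
  clicks = c(10, 20, 30),
  impressions = c(100, 400, 500)
)
exampleByDate$ctr &lt;- exampleByDate$clicks / exampleByDate$impressions

## CTR for the whole period: total clicks over total impressions
overallCTR &lt;- sum(exampleByDate$clicks) / sum(exampleByDate$impressions)

## Averaging the daily CTRs gives a different number, because it
## ignores how many impressions each day had
meanDailyCTR &lt;- mean(exampleByDate$ctr)

## The day with the most clicks
bestDay &lt;- exampleByDate$date[which.max(exampleByDate$clicks)]</pre>



<p>With these numbers, the overall CTR works out at 6% while the average of the daily CTRs is 7%, which is why it’s generally safer to recalculate rates from the totals rather than averaging them.</p>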



<h2 class="wp-block-heading" id="wrapping-up">Wrapping Up&nbsp;</h2>



<p>And there we go. We’ve learned how to install and initialise R packages and had a tour of the main Google Analytics and Google Search Console R packages. Remember to save your script from this piece, because we’re going to be using these datasets in the next few articles.&nbsp;</p>



<p>As always, if you have any questions, drop me a line on <a href="https://twitter.com/ben_johnston80" target="_blank" rel="noreferrer noopener">Twitter</a> or through the <a href="https://www.ben-johnston.co.uk/get-in-touch/">contact form</a>, and I’ll see you in a few days for the next piece, where we’ll talk about SEO data visualisation in R.</p>



<h3 class="wp-block-heading" id="our-code-from-today">Our Code From Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="Ben" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">## Install Single Package

install.packages("tidyverse")

library(tidyverse)

install.packages("googleAnalyticsR")

library(googleAnalyticsR)

install.packages("searchConsoleR")

library(searchConsoleR)

install.packages("googleAuthR")

library(googleAuthR)

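## Installing Multiple Packages

instPacks &lt;- c("tidyverse", "googleAnalyticsR", "searchConsoleR", "googleAuthR")

## require() only loads packages; run install.packages(instPacks) first if any aren't installed yet

lapply(instPacks, require, character.only = TRUE)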

## Authorise Google Analytics

ga_auth()

## Find Google Analytics Accounts

gaAccounts &lt;- ga_account_list()

viewID &lt;- gaAccounts$viewId[7]

testData &lt;- google_analytics(viewID, date_range = c("2022-02-01","2022-02-08"), metrics = "sessions")

GAData &lt;- google_analytics(viewID, date_range =c("2022-02-01","2022-02-08"), metrics = c("sessions", "users", "pageviews", "bouncerate"))

## Adding Segments

GASegments &lt;- ga_segment_list()

which(GASegments$name == "Organic Traffic")

orgSegment &lt;- segment_ga4("orgSegment", segment_id = "gaid::-5")

GADataOrg &lt;- google_analytics(viewID, date_range = c("2022-02-01","2022-02-08"), metrics = c("sessions", "users", "pageviews", "bouncerate"), segment = orgSegment)

## Date Range Dimension

GADataOrgDates &lt;- google_analytics(viewID, date_range = c("2022-02-01","2022-02-08"), metrics = c("sessions", "users", "pageviews", "bouncerate"), dimensions = "date", segment = orgSegment)

## Dynamic Dates

GADataDynamicDates &lt;- google_analytics(viewID, date_range =c("7DaysAgo","yesterday"), metrics = c("sessions", "users", "pageviews", "bouncerate"), dimensions = "date",segment= orgSegment)

## Search Console Data

scr_auth()

scData &lt;- search_analytics("https://www.ben-johnston.co.uk", startDate = Sys.Date() -7, Sys.Date() -1, searchType = "web")

## Break By Date

scDataByDate &lt;- search_analytics("https://www.ben-johnston.co.uk", startDate = Sys.Date() -7, Sys.Date() -1, searchType = "web", dimensions = "date")</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>R For SEO Part 1: The Basics</title>
      <link>https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Sun, 06 Feb 2022 14:26:36 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[R for SEO]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">http://167.71.131.91/?p=2954</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">R For SEO Part 1: The Basics</a></p>
<p>The R programming language has lots of benefits for SEOs, but it just doesn’t get as much love in the space as...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-1-basics/">R For SEO Part 1: The Basics</a></p>
<p>The R programming language has lots of benefits for SEOs, but it just doesn’t get as much love in the space as Python. I get it. The barrier to entry is a little higher and there are some things that you can do with Python or other languages that R is just not built for, but when it comes to analysis of chunky datasets or common SEO analytical functions, R can do it just as well as any other language – better in some cases.<br></p>



<p>R was the first programming language I learned “properly” after self-teaching a few bits of different ones here and there and, although I’m moving more towards focusing on Python and Julia these days for some of the work I’m doing, R will always hold a special place in my heart. With that in mind, I wanted to share a series of posts where I’ll show you just how you can use <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a>.</p>



<h2 class="wp-block-heading" id="how-my-r-for-seo-series-will-work">How My R For SEO Series Will Work</h2>



<p>Over the next eight posts, I’ll be taking you from complete R newbie up to the point where we’re doing serious SEO work, like building a rank checker and dashboard with the language, covering functions, visualisation, replicating Excel formulae and using APIs along the way. If you want to be the first to know when a new entry has dropped, sign up for my FREE email list. No spam, no sales, just updates of new content.</p>





<p>While it’s not a completely exhaustive course on R, my hope is that by the end, you’ll have enough of an understanding of the language to use it in your day-to-day SEO work and be able to find answers to any issues you’re having. I’d also absolutely love it if you’d share this series with your network. Over the next few months, I’ll be doing some more stuff for the R community focusing on SEO, so it would be great to have that amplification.<br><br>Today, we’re going to be covering the very basics of using <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a> and future posts will centre around specific elements of the language. I’ll update this post with links to those when they get published as well. Every post will introduce the concepts that we’re discussing and take you through how they work and, at the end, I’ll demonstrate what we’ve learned with an SEO-specific use case.<br><br>As always, feel free to use the table of contents below to skip around if you’re looking for a specific area and the code that we use in each section will be compiled into one script and posted here.<br><br>OK, let’s get started.</p>



<div class="wp-block-advanced-gutenberg-blocks-summary"><p class="wp-block-advanced-gutenberg-blocks-summary__title">Contents</p><div class="wp-block-advanced-gutenberg-blocks-summary__fold"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewbox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-chevron-up"><polyline points="18 15 12 9 6 15"></polyline></svg></div><ol role="directory" class="wp-block-advanced-gutenberg-blocks-summary__list"><li><a href="https://www.ben-johnston.co.uk#how-my-r-for-seo-series-will-work">How My R For SEO Series Will Work</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#why-use-r">Why Use R?</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#installing-r">Installing R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#installing-rstudio">Installing RStudio</a><ol><li><a href="https://www.ben-johnston.co.uk#a-quick-tour-of-rstudio">A Quick Tour of RStudio&nbsp;</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#your-first-rstudio-project">Your First RStudio Project&nbsp;</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#datasets-and-basic-calculations">Datasets&nbsp;And&nbsp;Basic Calculations</a><ol><li><a href="https://www.ben-johnston.co.uk#our-first-r-dataset">Our First R Dataset</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#our-first-calculation-in-r">Our First Calculation In R</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#reading-csv-data-in-r">Reading&nbsp;CSV&nbsp;Data In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#basic-data-exploration-in-r">Basic Data Exploration In R</a><ol><li><a href="https://www.ben-johnston.co.uk#head-and-tail-investigations-in-r-datasets">Head And Tail Investigations&nbsp;In&nbsp;R Datasets&nbsp;</a><ol></ol></li><li><a 
href="https://www.ben-johnston.co.uk#sum-average-max-and-minimum-values-in-r">Sum, Average, Max&nbsp;And&nbsp;Minimum Values In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#exploring-data-with-summarise-in-r">Exploring Data&nbsp;With&nbsp;Summarise In R</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#subsetting-data-in-r">Subsetting&nbsp;Data In R</a><ol><li><a href="https://www.ben-johnston.co.uk#how-the-subset-command-works">How&nbsp;The&nbsp;Subset Command Works&nbsp;</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#exporting-to-csv-with-r">Exporting&nbsp;To&nbsp;CSV With R</a><ol><li><a href="https://www.ben-johnston.co.uk#how-writecsv-works-in-r">How Write.csv Works In R</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#and-we’re-done">And We’re Done</a><ol><li><a href="https://www.ben-johnston.co.uk#our-code-from-today">Our Code&nbsp;From&nbsp;Today</a><ol></ol></li></ol></li></ol></div>



<p></p>



<h2 class="wp-block-heading" id="why-use-r">Why Use R?</h2>



<p>I’m not going to get into the R vs Python/SAS/MATLAB/Julia debate, but I suppose it’s a worthwhile place to start. Just why would we use R, a statistical programming language, in SEO?</p>



<p>Personally, I’ve always loved data and the pattern recognition that comes with its analysis – pretty similar to why I love SEO. I’ve always used data as my secret weapon within SEO and, when I trained as a data analyst, I was keen to make sure the skills I learned were as transferable as possible. These days, we’re seeing a lot of analytical programming being used by SEOs, and R is a fantastic way to leverage that.</p>



<p>Ultimately, in SEO, we’re always working with data, sometimes lots of it, and there are situations where the trusty Excel spreadsheet will either not suffice or the size of the dataset will kill your machine, so it’s worth learning R or similar as an additional string to your bow. Think of it as Excel on steroids and you won’t go far wrong.</p>



<p>As languages go, R <em>is</em> a bit on the limited side compared to others. It’s built for statistical analysis and it does that one thing very well, although as you may have seen with my <a href="https://www.ben-johnston.co.uk/bulk-resizing-images-with-r/">bulk image resizing</a> post, you can make it do plenty of other things as well. For pure data analysis, there aren’t many languages that are better, and the range of packages, visualisation options and, in my opinion, the best IDE on the market makes R a great choice for the data-savvy SEO.</p>



<h2 class="wp-block-heading" id="installing-r">Installing R</h2>



<p>Now that we’re ready to get started, the first thing we have to do is install the R language on our machine. This is really simple.</p>



<p>Firstly, go to the CRAN site at <a href="https://cran.r-project.org/mirrors.html" target="_blank" rel="noreferrer noopener">https://cran.r-project.org/mirrors.html</a> and select the server that’s closest to you.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="855" height="243" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/cran-install.png" alt="install R from CRAN" class="wp-image-2957" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/cran-install.png 855w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/cran-install-300x85.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/cran-install-150x43.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/cran-install-768x218.png 768w" sizes="(max-width: 855px) 100vw, 855px" /></figure>



<p>From here, choose the appropriate download for your machine. R is available on Windows, Linux and Mac, and it’s open source, so there’s really no barrier to getting started.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="945" height="254" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/r-base.png" alt="Install R from Base" class="wp-image-2959" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/r-base.png 945w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/r-base-300x81.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/r-base-150x40.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/12/r-base-768x206.png 768w" sizes="(max-width: 945px) 100vw, 945px" /></figure>



<p>Once your installer has downloaded, install as you usually would any other program. The only change to the installation process I suggest making is unticking “Associate with .Rdata” and .R files. The reason for that is that we’re going to use RStudio for our IDE and we don’t want to just open up the language console every time we double-click on a file.</p>



<p>I generally also recommend only installing the 64-bit version. Using a 32-bit version of the language kind of defeats the object of using this and if you’re still using a 32-bit machine in 2022, you should probably have a serious conversation with yourself or your IT department.</p>
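
<p>If you want to double-check which version and build you ended up with, a quick sketch like this in the R console will tell you. These are all built into R; what gets printed depends entirely on your own machine, so I’ve not shown any output.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Print the installed R version as a single line of text
R.version.string

# The architecture this copy of R was built for
R.version$arch

# Pointer size in bytes: 8 means a 64-bit build, 4 means a 32-bit build
.Machine$sizeof.pointer</pre>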



<p>Once everything’s installed, we’re ready to go to the next step: installing RStudio.</p>



<h2 class="wp-block-heading" id="installing-rstudio">Installing RStudio</h2>



<p><a href="https://www.rstudio.com/" target="_blank" rel="noreferrer noopener">RStudio</a> is R’s IDE (Integrated Development Environment) and, in my opinion, the best IDE on the market for data analysis, although I do also love <a href="https://www.jetbrains.com/pycharm/" target="_blank" rel="noreferrer noopener">PyCharm</a> for working in Python.</p>



<p>An IDE like RStudio lets you write and run your code as well as manage your objects, merge everything you’re working on into specific projects and see your visualisations in one window. They’re essential for any programming work that requires analysis, so it’s great that R has one that’s so good.</p>



<p>You can get RStudio from <a href="https://www.rstudio.com/products/rstudio/" target="_blank">https://www.rstudio.com/products/rstudio/</a> &#8211; don’t let the paid options scare you, I’ve been using the free one for years and it’s more than good enough for any application.</p>



<p>Installation is just as simple as any other program, but you’ll want to make sure that it’s associating with the&nbsp;relevant&nbsp;file types.&nbsp;If it’s your first installation, select all options.&nbsp;</p>



<p>OK, now we’re all installed let’s start using R.&nbsp;</p>



<h3 class="wp-block-heading" id="a-quick-tour-of-rstudio">A Quick Tour of RStudio&nbsp;</h3>



<p>The easiest way to explain how RStudio is laid out is simply to&nbsp;<em>show you</em>, so take a look at the image below.&nbsp;</p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="560" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/febKHlQw-1024x560.png" alt="RStudio layout" class="wp-image-2967" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/febKHlQw-1024x560.png 1024w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/febKHlQw-300x164.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/febKHlQw-150x82.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/febKHlQw-768x420.png 768w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/febKHlQw-1536x840.png 1536w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/febKHlQw.png 1919w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Again, we’re not going fully exhaustive here, but hopefully this screenshot will give you an idea of what’s&nbsp;where. We’ll cover the specific functionality of each section shortly, but here’s a top-line overview:&nbsp;</p>



<ol class="wp-block-list"><li><strong>The Script Window:</strong>&nbsp;Where you write and save your code. Nothing you type here runs until you send it to the console, so you can experiment without breaking anything, and it all saves to&nbsp;a .R&nbsp;file</li><li><strong>The Environment Explorer:&nbsp;</strong>Where you can see the datasets, variables,&nbsp;objects&nbsp;and functions you’ve created in R</li><li><strong>The R Console:</strong>&nbsp;This is where your code&nbsp;actually runs&nbsp;after you’ve written it in the script window</li><li><strong>The Plot Window:</strong>&nbsp;Where you’ll see the graphs you generate. The different tabs also let you see the files in your working directory, the packages you’ve got installed and view help documentation</li></ol>



<p>Now we know what’s where, the best way to get to grips with R is just to start using it, so let’s go.&nbsp;</p>



<h2 class="wp-block-heading" id="your-first-rstudio-project">Your First RStudio Project&nbsp;</h2>



<p>Before we do anything, we need to create an RStudio project, which will store all our&nbsp;datasets, our code and everything else we’re working on. You should do this with every single piece of work you do with R, just so you can go back to it later.&nbsp;</p>



<p>As you get a bit more advanced, you may well want to use version control, which I highly recommend. You can read my guide to <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/git-for-data-analysts-the-complete-guide/"   title="Git for Data Analysts" data-wpil-keyword-link="linked">Git for Data Analysts</a> to get an understanding of how you can use that and what best practices would look like, but we’re not there yet.&nbsp;</p>



<p>First, create a folder that you’d like to save your first project in.&nbsp;</p>



<p>Now in RStudio, click on File and select “New Project” like so.&nbsp;</p>



<figure class="wp-block-image size-full"><img decoding="async" width="220" height="416" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/y8ezFyHQ.png" alt="new RStudio project" class="wp-image-2969" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/y8ezFyHQ.png 220w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/y8ezFyHQ-159x300.png 159w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/y8ezFyHQ-79x150.png 79w" sizes="(max-width: 220px) 100vw, 220px" /></figure>



<p>Now you’ll want to select “Existing Directory” since we’ve just created that folder.&nbsp;</p>



<p>Navigate to your project directory and click “Create Project”.&nbsp;</p>



<figure class="wp-block-image size-full"><img decoding="async" width="533" height="379" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/find-directory.png" alt="Create RStudio project in existing directory" class="wp-image-2970" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/find-directory.png 533w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/find-directory-300x213.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/find-directory-150x107.png 150w" sizes="(max-width: 533px) 100vw, 533px" /></figure>



<p>Finally, click File, then “New File” and “R Script” to give yourself&nbsp;a .R&nbsp;file to&nbsp;write and&nbsp;store your&nbsp;code.</p>



<p>OK, great. Now we’re ready to get cracking on our first R project. But first, we need to get some understanding of how it works.</p>



<h2 class="wp-block-heading" id="datasets-and-basic-calculations">Datasets&nbsp;And&nbsp;Basic Calculations</h2>



<p>Since we’re starting from the very beginning here, we need to start from the actual beginning – creating&nbsp;datasets&nbsp;and basic calculations. Let’s not run before we can walk.</p>



<p>Let’s create our first&nbsp;data frame&nbsp;– essentially, a&nbsp;dataset in its own right.</p>



<p>There are three parts to creating any object in R, data frames included:&nbsp;</p>



<ol class="wp-block-list"><li><strong>The name:&nbsp;</strong>What we call this variable. It’s always best to choose the most descriptive name you can so you can figure out what it’s doing later</li><li><strong>The function:</strong>&nbsp;This isn’t always essential if your variable is just a number or a string of text, but sometimes, you’ll want to call a function. More on that later</li><li><strong>The data:</strong>&nbsp;What information are we including in this variable? This is where we define it&nbsp;</li></ol>



<p>Again, we’re only at the basics, so not all of these are always necessary, and as we progress, there will sometimes be more elements.</p>
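
<p>As a preview of how the name, the function and the data fit together, here’s a minimal sketch of a small data frame. The name keywordData and the queries and click counts inside it are invented purely for illustration, and there’s no need to run it yet.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># The name: keywordData
# The function: data.frame()
# The data: the two columns inside the brackets (made-up values)
keywordData &lt;- data.frame(
  query  = c("r for seo", "rstudio tutorial", "read csv in r"),
  clicks = c(12, 7, 3)
)

# Print the data frame to the console
keywordData

# Count its rows and columns
nrow(keywordData)
ncol(keywordData)</pre>

<p>Here the data frame has three rows and two columns, one column per piece of data we passed to the function.</p>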



<h3 class="wp-block-heading" id="our-first-r-dataset">Our First R Dataset</h3>



<p>Let’s create and store our first object:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">x &lt;- 2</pre>



<p>Seems simple, right?&nbsp;That’s because&nbsp;it is, but let’s break it down anyway.&nbsp;</p>



<ul class="wp-block-list"><li><strong>x:</strong>&nbsp;The name of our dataset. I know I said to keep your names descriptive, but all shall become clear later&nbsp;</li><li><strong>&lt;-:</strong>&nbsp;The assignment operator. The arrow tells R to store the value on its right under the name on its left. Some people use the = symbol, which also works fine, but in R, &lt;- is the standard way of doing it&nbsp;</li><li><strong>2:</strong>&nbsp;The value of our dataset. We’ll be working with much bigger values&nbsp;later on, but we’re starting at the basics right now&nbsp;</li></ul>



<p>So, following that, we can tell that our dataset called “x” has a value of 2.&nbsp;</p>



<p>Paste this into your console window and hit enter, and you should see it in your RStudio Environment Explorer.&nbsp;</p>



<h3 class="wp-block-heading" id="our-first-calculation-in-r">Our First Calculation In R</h3>



<p>Now we’ve got our first object, x, with a value of 2, let’s create another one before we start our first calculation in R.&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">y &lt;- 3</pre>



<p>Now we’ve got a second variable called y with a value of 3. What if we created a new object called z where we try different calculations of the two?&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">z &lt;- x+y</pre>



<p>As you may have guessed, here we’re adding our x and y datasets and creating a new object called z with the total.&nbsp;</p>



<p>Try it out. You can just type your dataset name into the R console in RStudio to print the value in the console, or type print(z), whichever you prefer.&nbsp;</p>



<p>If you used the same values I used for the datasets, you should be given the following in the R console:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[1] 5</pre>



<p>Here, the [1] tells you the position of the first value printed on that line (there’s only one value in this dataset, so [1] is all we see), followed by the value itself: 5, in this case.&nbsp;</p>
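
<p>That [1] makes more sense once you’ve seen a longer output. Here’s a small sketch using a made-up sequence of numbers. I haven’t shown the printed result for the sequence, because where the lines wrap depends on how wide your console is.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">x &lt;- 2
y &lt;- 3
z &lt;- x + y

# z holds a single value, so its length is 1
length(z)

# Asking for the first value of z gives us 5
z[1]

# A longer object prints over several lines, each one starting with
# the position of the first value on that line
seq(10, 300, by = 10)</pre>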



<p>You can do the same thing with other common mathematical calculations:&nbsp;</p>



<ul class="wp-block-list"><li><strong>z &lt;- x*y:</strong>&nbsp;This will multiply x and y&nbsp;</li><li><strong>z &lt;- x-y:</strong>&nbsp;This will subtract y from x&nbsp;</li><li><strong>z &lt;- x/y:</strong>&nbsp;This will divide x by y&nbsp;</li></ul>
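
<p>If you’d like to check your results against mine, here’s a quick sketch of those calculations using the same x and y values as above, with the number you should see noted in each comment.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">x &lt;- 2
y &lt;- 3

# Multiplication: 6
x * y

# Subtraction: -1
x - y

# Division: 0.6666667
x / y</pre>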



<p>It’s worth knowing that you can overwrite your dataset names at any time whenever you need to make changes to your code, so you can use the same dataset multiple times, but it’s generally not best practice to do so unless you’ve made a mistake. More on that later.&nbsp;</p>



<p>But we didn’t start learning R to just do basic calculations, did we? Let’s start using it properly on some real SEO data.&nbsp;</p>



<p>Of course,&nbsp;in order to&nbsp;do that, we’re going to need some data to work with. Let’s import a spreadsheet and work with that.</p>



<h2 class="wp-block-heading" id="reading-csv-data-in-r">Reading&nbsp;CSV&nbsp;Data In R</h2>



<p>Since we’re focusing our R learning on SEO, it makes sense for our first project to be&nbsp;using SEO data, so why not use a Google Search Console export? As we go forward, I’ll show you how to get this directly from the <a href="https://www.ben-johnston.co.uk/r-for-seo-part-6-using-apis-in-r/"  data-wpil-monitor-id="215">API</a>, but that’s another lesson for another time. For now, just do the standard Google Search Console export of your queries.&nbsp;</p>



<p>This should give you a file called “Queries.csv”. Move that file into your project directory.&nbsp;</p>
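
<p>Before reading the file in, it can be worth confirming that R can actually see it. This is just a sketch using three base R commands, and it assumes your file is called Queries.csv and that you’ve opened the project we created earlier.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Show the folder R is currently working in (this should be your project directory)
getwd()

# List the CSV files R can see in that folder
list.files(pattern = "\\.csv$")

# TRUE means Queries.csv is in place and ready to be read in
file.exists("Queries.csv")</pre>

<p>If the last command returns FALSE, the file is either in a different folder or has a slightly different name.</p>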



<p>Now we’re going to use a command&nbsp;that you’ll be using an awful lot during your time with R – read.csv.&nbsp;</p>



<p>This command reads the contents of a CSV file into your R environment, retaining the structure that you’d see if you opened it in Excel. Let’s see how it works.&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queries &lt;- read.csv("Queries.csv", stringsAsFactors = FALSE)</pre>



<p>There it&nbsp;is,&nbsp;your first piece of “proper” R code. Feels&nbsp;pretty cool, right? Put this in your script editor window and, when you’re happy with it, paste it into your console and hit enter.&nbsp;</p>



<p>You’ll see a dataset called “queries” pop up in the environment explorer in the top right of your RStudio window.</p>



<p>Let’s break it down. As we go through, I’m not going to break down every single command, but this is our first, so it makes sense to.</p>



<ul class="wp-block-list"><li><strong>queries &lt;-</strong>: We’re telling R that we want to call our new dataset “queries”. Again, the “&lt;-” operator&nbsp;is similar to the “=” you’d see in other languages like Python. = does&nbsp;actually work&nbsp;in R, but it’s most common to use &lt;- to define our datasets</li><li><strong>read.csv:</strong>&nbsp;The name of our&nbsp;command – in this case, the function to read a CSV file into our environment is helpfully named “read.csv”. Don’t get too used to this, there are some <a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/"  data-wpil-monitor-id="212">R functions</a> that have me wondering why you’d ever call them that, but that’s part and parcel of any programming language</li><li><strong>(“Queries.csv”,:</strong>&nbsp;The name of the CSV file we’re reading in. Don’t forget the quotation marks or to put the full name of the file in, including the .csv extension</li><li><strong>stringsAsFactors&nbsp;= FALSE):</strong>&nbsp;There’s a&nbsp;<a href="https://www.r-bloggers.com/2018/03/r-tip-use-stringsasfactors-false/" target="_blank" rel="noreferrer noopener">fantastic article on R-Bloggers</a>&nbsp;about what this does and what you should use it for, but in general, factors are absolutely terrible things and&nbsp;you will almost never want your text data to be read in as factors. Since R 4.0.0, FALSE is the default, but on older versions it isn’t, so I’d still add&nbsp;stringsAsFactors&nbsp;= FALSE whenever you’re reading in CSV data to be safe</li></ul>
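
<p>If you don’t have an export to hand, here’s a self-contained sketch that shows read.csv doing its thing. It writes a tiny stand-in file to a temporary location first. The rows, the sampleFile and sampleQueries names and the column layout are my assumptions for illustration, so your real Queries.csv may well look different.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Write a tiny stand-in for a Search Console export to a temporary file
# (these rows are invented for illustration)
sampleFile &lt;- tempfile(fileext = ".csv")
writeLines(c(
  "Top queries,Clicks,Impressions,CTR,Position",
  "r for seo,12,250,4.8%,6.2",
  "rstudio install,3,40,7.5%,11.4"
), sampleFile)

# Read it in the same way we read in Queries.csv
sampleQueries &lt;- read.csv(sampleFile, stringsAsFactors = FALSE)

# read.csv tidies up the headers: the space in "Top queries" becomes a dot
names(sampleQueries)

# Text columns come in as plain character strings rather than factors
class(sampleQueries$Top.queries)</pre>

<p>That header tidying is why you’ll see a name like Top.queries in R even though the spreadsheet header has a space in it.</p>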



<p>And there we have it, we’ve got some data to work with.</p>



<p>One of the key differences between analysing data with code compared to analysing data through Excel is that&nbsp;you can’t&nbsp;actually&nbsp;<em>see</em>&nbsp;it in code compared to looking at a spreadsheet. But not to worry, R has&nbsp;a number of&nbsp;really easy ways to explore and investigate your SEO data.</p>



<h2 class="wp-block-heading" id="basic-data-exploration-in-r">Basic Data Exploration In R</h2>



<p>There are a few key functions that we can use to explore datasets in R:&nbsp;</p>



<ul class="wp-block-list"><li><strong>nrow:&nbsp;</strong>Tells us how many rows we have in there&nbsp;</li><li><strong>str:&nbsp;</strong>Tells us the structure of the dataset – the headers, the type of data in each one and the first few values of each</li><li><strong>head:&nbsp;</strong>Shows the first few rows of the dataset&nbsp;</li><li><strong>tail:&nbsp;</strong>Shows the last few rows of the dataset&nbsp;</li></ul>



<p>There are a few others, which we’ll look at in a second.&nbsp;</p>



<p>First, we want to see how many rows we have in our dataset. You can just look in the Environment Explorer to the top right, where the number of observations is shown next to the dataset’s name.&nbsp;</p>



<p>But if we want to see it in our Console window, we’d type the following:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">nrow(queries)</pre>



<p>This will return the number of rows in our dataset. In my case, it’s given me 646. Your number will almost certainly be different (and probably higher if you spend any actual time working on your site, which I am very bad for).</p>



<figure class="wp-block-image size-full"><img decoding="async" width="279" height="273" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/nrow.png" alt="nrow() command in R" class="wp-image-2978" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/nrow.png 279w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/nrow-150x147.png 150w" sizes="(max-width: 279px) 100vw, 279px" /></figure>



<p>Now let’s see what the structure looks like. What headers do we have in our dataset?&nbsp;</p>



<p>To do that, type the following:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">str(queries)</pre>



<p>This will return the following output:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="792" height="346" src="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/str.png" alt="str() command in R" class="wp-image-2979" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/str.png 792w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/str-300x131.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/str-150x66.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2022/01/str-768x336.png 768w" sizes="(max-width: 792px) 100vw, 792px" /></figure>



<p>As you can see, we’ve got the following headers in our Search Console dataset:&nbsp;</p>



<ul class="wp-block-list"><li><strong>Top.queries:</strong>&nbsp;The search terms that people have used to find our website&nbsp;</li><li><strong>Clicks:</strong>&nbsp;The number of times each search term has been clicked&nbsp;</li><li><strong>Impressions:</strong>&nbsp;The number of times each search term has been seen in search results</li><li><strong>CTR:</strong>&nbsp;The percentage of clicks to impressions&nbsp;</li><li><strong>Position:</strong>&nbsp;The average position that the search term has been seen in&nbsp;</li></ul>
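
<p>If you only want the headers or the data types rather than the full str() output, base R has a few shortcuts. Here’s a sketch using a made-up stand-in for the queries dataset. The values are invented, and storing CTR as text with a % sign is my assumption about the export rather than something to rely on.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># A made-up stand-in for the queries dataset, with the same headers
sampleQueries &lt;- data.frame(
  Top.queries = c("r for seo", "rstudio install", "seo with r"),
  Clicks      = c(12, 3, 0),
  Impressions = c(250, 40, 1),
  CTR         = c("4.8%", "7.5%", "0%"),
  Position    = c(6.2, 11.4, 48),
  stringsAsFactors = FALSE
)

# Just the headers
names(sampleQueries)

# Rows and columns in one go: 3 rows, 5 columns
dim(sampleQueries)

# The type of data held in each column
sapply(sampleQueries, class)</pre>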



<p>This information is obviously&nbsp;really useful&nbsp;to us as SEOs and over the next series of posts, we’ll be looking at how we can use it more.</p>



<h3 class="wp-block-heading" id="head-and-tail-investigations-in-r-datasets">Head And Tail Investigations&nbsp;In&nbsp;R Datasets&nbsp;</h3>



<p>Now for further investigation, let’s&nbsp;take a look&nbsp;at the top and bottom values of our Search Console dataset using R’s “head” and “tail” functions.&nbsp;</p>



<p>To see the top of your dataset, in your console, type:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">head(queries)</pre>



<p>And to see the bottom of it, type:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tail(queries)</pre>



<p>By default, these will show the first six and last six rows&nbsp;respectively, and&nbsp;can be vital in exploring your datasets.</p>
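
<p>If you want more or fewer rows than head and tail give you out of the box, both accept an n argument. Here’s a small sketch using a made-up dataset of 30 queries, so the row counts are easy to check.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up data: 30 queries with 1 to 30 impressions
sampleQueries &lt;- data.frame(
  Top.queries = paste("query", 1:30),
  Impressions = 1:30
)

# With no second argument, head() and tail() return six rows each
nrow(head(sampleQueries))
nrow(tail(sampleQueries))

# Pass n to ask for a different number of rows
head(sampleQueries, n = 20)
tail(sampleQueries, n = 3)</pre>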



<h3 class="wp-block-heading" id="sum-average-max-and-minimum-values-in-r">Sum, Average, Max&nbsp;And&nbsp;Minimum Values In R</h3>



<p>This is all&nbsp;really useful, but sometimes we just want to see the maximum, minimum, average or total values of a dataset or a variable. Here’s how you do that, using our Google Search Console data.&nbsp;</p>



<p>Firstly, we need to understand how to focus on a specific variable (similar to&nbsp;a column) in a dataset rather than the dataset as a whole. This is&nbsp;really easy&nbsp;– you simply type your dataset name, add the $ character and then type the name of your variable. If you don’t know the name, you can find it with the&nbsp;str() command, or RStudio will give you a lovely dropdown menu which will autocomplete it for you. I told you RStudio was great!&nbsp;</p>



<p>To see the total value of your Impressions variable (similar to&nbsp;running =SUM on a column in Excel), type:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sum(queries$Impressions)</pre>



<p>To see the mean average value of your Impressions variable (similar to&nbsp;using =AVERAGE on a column in Excel), type:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">mean(queries$Impressions)</pre>



<p>And if you want to see the median value of your Impressions, you’d type:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">median(queries$Impressions)</pre>



<p>Now let’s see what the maximum and minimum values are. We’ll cover this in more detail in <a href="https://www.ben-johnston.co.uk/r-for-seo-part-5-common-excel-formulas-in-r/"  data-wpil-monitor-id="211">part 5 when we start replicating common Excel</a> functions in R.&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">max(queries$Impressions)</pre>



<p>This will show you the highest number of impressions you’ve had.&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">min(queries$Impressions)</pre>



<p>And this will show you the lowest number.&nbsp;</p>
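
<p>To see all five of those working together on numbers you can check by hand, here’s a sketch using a made-up Impressions column. The values are invented, and the results are noted in the comments.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up data: six queries and their impressions
sampleQueries &lt;- data.frame(
  Top.queries = paste("query", 1:6),
  Impressions = c(1, 1, 5, 12, 40, 250)
)

sum(sampleQueries$Impressions)     # 309
mean(sampleQueries$Impressions)    # 51.5
median(sampleQueries$Impressions)  # 8.5
max(sampleQueries$Impressions)     # 250
min(sampleQueries$Impressions)     # 1

# If a column contains missing values (NA), these functions return NA
# unless you tell them to skip the gaps with na.rm = TRUE
sum(c(1, NA, 3))                # NA
sum(c(1, NA, 3), na.rm = TRUE)  # 4</pre>

<p>Notice how far apart the mean and median are here. One query with 250 impressions drags the mean right up, which is very typical of Search Console data and a good reason to look at both.</p>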



<h3 class="wp-block-heading" id="exploring-data-with-summarise-in-r">Exploring Data&nbsp;With&nbsp;Summarise In R</h3>



<p>Finally, let’s create a&nbsp;summary&nbsp;of our impressions data, so we can see all these values in one go. For bonus points, we’ll store it as an object, so we can review it at any time.&nbsp;</p>



<p>To use the summary function, we’re going to be identifying specific headers from the dataset rather than using&nbsp;it as a whole. To do that, we’re going to use the “$” symbol to call out the specific headers we want to look at, which our&nbsp;str() command will help us find (although RStudio will also autocomplete for us).&nbsp;</p>



<p>Let’s get a summary of our impressions.&nbsp;</p>



<p>Type:&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">impVals &lt;- summary(queries$Impressions)</pre>



<p>This will give you a new variable called&nbsp;impVals&nbsp;(short for Impression Values – remember how I was saying about giving variables memorable names?), which incorporates the following:&nbsp;</p>



<ul class="wp-block-list"><li><strong>Min:</strong>&nbsp;The lowest number in our range&nbsp;</li><li><strong>1<sup>st</sup>&nbsp;Qu:</strong>&nbsp;The first quartile, the value that 25% of the data falls below when sorted in ascending order&nbsp;</li><li><strong>Median:</strong>&nbsp;The middle value when the data is sorted&nbsp;</li><li><strong>Mean:</strong>&nbsp;The average value (the total divided by the number of entries)&nbsp;</li><li><strong>3<sup>rd</sup>&nbsp;Qu:</strong>&nbsp;The third quartile, the value that 75% of the data falls below when sorted in ascending order</li><li><strong>Max:</strong>&nbsp;The highest value in the range&nbsp;</li></ul>
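
<p>Once a summary is stored, you can pull individual values out of it by name, or turn it into a small data frame if you’d rather have it in that shape. This is only a sketch: the impressions are made up, and impTable is just a name I’ve picked for the example.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up impressions for six queries
impressions &lt;- c(1, 1, 5, 12, 40, 250)

# Store the summary so we can come back to it
impVals &lt;- summary(impressions)

# The labels R gives each value (note the full stops in some of them)
names(impVals)

# Pull out single values by their label
impVals[["Median"]]  # 8.5
impVals[["Max."]]    # 250

# Optionally, turn the summary into a two-column data frame
impTable &lt;- data.frame(
  stat  = names(impVals),
  value = as.numeric(impVals)
)
impTable</pre>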



<p>Summaries can be incredibly useful when exploring data. In fact, it’s quite rare that I don’t have a summary in my R environment of most of the pieces of data I work with, because they’re just so handy to have for reference.</p>



<h2 class="wp-block-heading" id="subsetting-data-in-r">Subsetting&nbsp;Data In R</h2>



<p>One of the unfortunate truths of working with data is that most datasets will have a lot of information in them that you don’t want.&nbsp;From irrelevant queries,&nbsp;to numbers too small to be useful, there will be times that you want to just cut information from your dataset, or focus on one specific element. That’s where&nbsp;subsetting&nbsp;comes in.&nbsp;&nbsp;</p>



<p>Subsetting&nbsp;is a hugely powerful tool for your SEO analysis, so it’s well worth learning how to do it with R. We’ll be using it in the coming articles, so what better time to go through how it works?&nbsp;</p>



<p>Google Search Console data&nbsp;in particular is&nbsp;a prime candidate for&nbsp;subsetting&nbsp;since you’ll wind up with a lot of random queries in there at times, which are worth cutting out.&nbsp;</p>



<p>Let’s look at our Google Search Console dataset that we’ve already read into our environment. From the summary in my example, we can see that there are a lot of queries that have only had one impression. We can say that we don’t want to include these in our analysis.&nbsp;</p>



<p>What we’re going to do here is to create a new dataset based on our queries&nbsp;dataset, but&nbsp;cutting out any queries with fewer than 20 impressions. This would take a few steps to do in Excel, but thankfully, we can do it in one line with R.&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queriesSub &lt;- subset(queries, Impressions >=20)</pre>



<p>That’s it. Let’s break it down:</p>



<h3 class="wp-block-heading" id="how-the-subset-command-works">How&nbsp;The&nbsp;Subset Command Works&nbsp;</h3>



<p>The subset command above works as follows:&nbsp;</p>



<ul class="wp-block-list"><li><strong>queriesSub&nbsp;&lt;-:</strong>&nbsp;Our dataset’s name</li><li><strong>subset(:</strong>&nbsp;We’re telling R that subset is the function we want to use&nbsp;</li><li><strong>queries:</strong>&nbsp;The name of the dataset we’re working on (queries in this case, the Search Console data we imported earlier)&nbsp;</li><li><strong>Impressions:</strong>&nbsp;The variable within the dataset that we want to focus on (Impressions in this case)&nbsp;</li><li><strong>&gt;=20):</strong>&nbsp;We’re telling R that we want to only include queries that have 20 or more impressions &#8211; &gt;= means “Greater than or equal to”. Don’t forget your closing bracket&nbsp;</li></ul>



<p>Now we have a new dataset called&nbsp;queriesSub&nbsp;without as many junk impressions included. You can also use a range of other commands with subsets, such as &lt;= (less than or equal to), == (exactly equal to – don&#8217;t forget the double = symbol for an exact match) and more. You can also subset based on specific text strings and much more. We’ll cover more of that in future pieces.&nbsp;</p>
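
<p>Here’s a sketch of those operators on a made-up dataset small enough to check by eye. The text match at the end uses base R’s grepl function, which is one way of doing it rather than necessarily the approach we’ll use later in the series.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up data: five queries and their impressions
sampleQueries &lt;- data.frame(
  Top.queries = c("r for seo", "rstudio install", "seo with r",
                  "read csv r", "r subset"),
  Impressions = c(1, 8, 20, 50, 120),
  stringsAsFactors = FALSE
)

# 20 or more impressions: 3 rows
nrow(subset(sampleQueries, Impressions &gt;= 20))

# 50 or fewer impressions: 4 rows
nrow(subset(sampleQueries, Impressions &lt;= 50))

# Exactly 50 impressions: 1 row
nrow(subset(sampleQueries, Impressions == 50))

# Queries containing the text "seo": 2 rows
subset(sampleQueries, grepl("seo", Top.queries, fixed = TRUE))

# Combine two conditions with &amp;: contains "seo" AND has 20 or more impressions
subset(sampleQueries, Impressions &gt;= 20 &amp; grepl("seo", Top.queries, fixed = TRUE))</pre>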



<h2 class="wp-block-heading" id="exporting-to-csv-with-r">Exporting&nbsp;To&nbsp;CSV With R</h2>



<p>As you work further through your analysis, there will certainly be times that you need to export your data to CSV. Perhaps you need to share it, maybe you want to use the&nbsp;information in another tool or spreadsheet, whatever. The point is, you’ll need to do it. Fortunately, this is very simple in R using the write.csv command.&nbsp;</p>



<p>Let’s&nbsp;export the subset we created of queries with 20 or more impressions.&nbsp;</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">write.csv(queriesSub, "queriesSub.csv")</pre>



<p>Nice and simple. Now if you look in the files pane in RStudio, you’ll see your new file, and it’ll be in your project directory too, all ready for sharing or importing into something else.&nbsp;&nbsp;</p>



<p>Let’s break it down.</p>



<h3 class="wp-block-heading" id="how-writecsv-works-in-r">How Write.csv Works In R</h3>



<p>The write.csv command ships with R itself, meaning you don’t need any&nbsp;extra packages or dependencies to run it.&nbsp;It works as follows:</p>



<ul class="wp-block-list"><li><strong>write.csv(:</strong>&nbsp;Here, we’re telling R that we want to use the write.csv command and export a specific dataset to the csv format. Others such as write.table are available if you need a different delimited text format, and they all work largely the same way&nbsp;</li><li><strong>queriesSub:&nbsp;</strong>We’re saying that the dataset that should be written to CSV is our&nbsp;queriesSub&nbsp;dataset. As you do more with R, you’ll be changing this a lot, but it’s a nice and simple command&nbsp;</li><li><strong>“queriesSub.csv”):&nbsp;</strong>Here, we’re naming the file that we want to export to. Nice and simple, but don’t forget the quote marks&nbsp;</li></ul>
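
<p>One thing worth knowing is that write.csv adds R’s row numbers as an extra first column unless you tell it not to. Here’s a sketch of the difference, using a made-up subset and a temporary file so it doesn’t touch your project directory. The sampleSub and outFile names are just for this example.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up subset to export
sampleSub &lt;- data.frame(
  Top.queries = c("seo with r", "read csv r"),
  Impressions = c(20, 50),
  stringsAsFactors = FALSE
)

# A temporary file to write to
outFile &lt;- tempfile(fileext = ".csv")

# Default behaviour: row numbers are written as an unnamed first column,
# which comes back as "X" when the file is read in again
write.csv(sampleSub, outFile)
names(read.csv(outFile))

# row.names = FALSE leaves that extra column out
write.csv(sampleSub, outFile, row.names = FALSE)
names(read.csv(outFile))</pre>

<p>If the file is going to be opened in Excel or imported into another tool, adding row.names = FALSE usually gives you the cleaner result.</p>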



<p>There we go. Now you know how to do some of the basics in R, which will hopefully help you elevate your SEO&nbsp;analysis game.</p>



<h2 class="wp-block-heading" id="and-we’re-done">And We’re Done</h2>



<p>Thanks for working your way through the first instalment in my <a href="https://www.ben-johnston.co.uk/category/r/r-seo/">R for SEO</a> series. Hopefully now you’ve got a grasp of the basics of R and you’re all ready for the next piece where we’ll be covering Packages, Google Analytics and Google Search Console data. </p>



<p>This is where you’ll start to see it coming together, so I really hope you’ll join me next week.&nbsp;</p>



<p>If you’re going to be along for the ride, thank you. Sign up for my mailing list below and you’ll get an email notification when it gets published, so you can work through the exercises at your leisure. If you have any questions, hit me up on <a href="https://x.com/ben_johnston429" target="_blank" rel="noreferrer noopener">Twitter</a>.</p>



<p>Until next week.</p>



<h3 class="wp-block-heading" id="our-code-from-today">Our Code&nbsp;From&nbsp;Today</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># First Variables

x &lt;- 2

y &lt;- 3

# First Calculations

z &lt;- x+y

print(z)

z &lt;- x*y

z &lt;- x-y

z &lt;- x/y

# Read In Search Console Data

queries &lt;- read.csv("Queries.csv", stringsAsFactors = FALSE)

# Explore Search Console Data

nrow(queries)

str(queries)

head(queries)

tail(queries)

## Further Exploration

sum(queries$Impressions)

mean(queries$Impressions)

median(queries$Impressions)

max(queries$Impressions)

min(queries$Impressions)

## Summary

impVals &lt;- summary(queries$Impressions)

# Subsetting

## Subset Queries To 10 Or More Impressions

queriesSub &lt;- subset(queries, Impressions >=10)

## Subset Queries To 50 Or Fewer Impressions

queriesSub &lt;- subset(queries, Impressions &lt;=50)

## Subset Queries To Exactly 50

queriesSub &lt;- subset(queries, Impressions ==50)</pre>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
    <item>
      <title>Keyword &amp; Topic Clustering For SEO With R</title>
      <link>https://www.ben-johnston.co.uk/keyword-topic-clustering-seo-r/</link>
      <dc:creator><![CDATA[Ben Johnston]]></dc:creator>
      <pubDate>Tue, 09 Feb 2021 21:16:28 +0000</pubDate>
      <category><![CDATA[R]]></category>
      <category><![CDATA[SEO]]></category>
      <guid isPermaLink="false">http://167.71.131.91/?p=2878</guid>
      <description><![CDATA[<p><a href="https://www.ben-johnston.co.uk/keyword-topic-clustering-seo-r/">Keyword &#038; Topic Clustering For SEO With R</a></p>
<p>Keyword and topic clustering for SEO has been a hot topic for years, but one of the things I’ve noticed is that...</p>
<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></description>
      <content:encoded><![CDATA[<p><a href="https://www.ben-johnston.co.uk/keyword-topic-clustering-seo-r/">Keyword &#038; Topic Clustering For SEO With R</a></p>
<p>Keyword and topic clustering for SEO has been a hot topic for years, but one of the things I’ve noticed is that there’s been a distinct lack of discussion around how to actually do it. There’s just been a torrent of theory and a handful of (rather expensive) tools which say they’ll do it for you. With that in mind, I thought I’d spend a bit of time putting together a really basic guide using R to get you started.</p>



<p>I’ve been using keyword and topic clustering as part of my SEO keyword research and content planning approaches for years, also incorporating <a href="https://www.ben-johnston.co.uk/sentiment-analysis-for-seo-using-google-sheets/" target="_blank" rel="noreferrer noopener">sentiment analysis</a> and a couple of other fun areas to help my teams <em>really</em> target their content. While I’m not quite ready to give up all my secrets, I see so much discussion where people never show their working, do it all manually or just try to sell their tools, so I wanted to help people make a bit of a start using free, open-source software.</p>



<p>Ready to get started? Cool. </p>



<p>If you’ve got some familiarity with these methodologies, feel free to skip around using the table of contents below and if you find this useful and you’d like more content like this in your inbox, please sign up for my <strong>free</strong> email newsletter.</p>



<div class="wp-block-advanced-gutenberg-blocks-summary"><p class="wp-block-advanced-gutenberg-blocks-summary__title">Contents</p><div class="wp-block-advanced-gutenberg-blocks-summary__fold"><svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewbox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="feather feather-chevron-up"><polyline points="18 15 12 9 6 15"></polyline></svg></div><ol role="directory" class="wp-block-advanced-gutenberg-blocks-summary__list"><li><a href="https://www.ben-johnston.co.uk#what-you’ll-need">What You’ll Need</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#read-in-your-data">Read In Your Data</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#preparing-your-text-data-for-clustering">Preparing Your Text Data For Clustering</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#cleaning-a-corpus">Cleaning a Corpus</a><ol><li><a href="https://www.ben-johnston.co.uk#a-corpus-cleaning-function-for-r">A Corpus-Cleaning Function For R</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#using-k-means-clustering-for-keywords-in-r">Using K-Means Clustering For Keywords In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#finding-the-optimal-number-of-clusters">Finding The Optimal Number Of Clusters</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#the-elbow-method">The Elbow Method</a><ol><li><a href="https://www.ben-johnston.co.uk#running-the-elbow-method-in-r">Running The Elbow Method In R</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#visualising-the-elbow-method-using-ggplot2">Visualising The Elbow Method Using GGPlot2</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#using-the-elbow-method-to-identify-the-optimal-number-of-clusters">Using The Elbow Method To Identify The Optimal Number Of Clusters</a><ol></ol></li></ol></li><li><a 
href="https://www.ben-johnston.co.uk#what-can-you-do-with-this">What Can You Do With This?</a><ol><li><a href="https://www.ben-johnston.co.uk#explore-your-clusters-by-subsetting">Explore Your Clusters By Subsetting</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#create-wordclouds-by-cluster">Create Wordclouds By Cluster</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#plotting-a-combo-chart-with-r-amp;-ggplot">Plotting A Combo Chart With R &amp; GGPlot</a><ol></ol></li></ol></li><li><a href="https://www.ben-johnston.co.uk#wrapping-up">Wrapping Up</a><ol></ol></li><li><a href="https://www.ben-johnston.co.uk#the-full-keyword-and-topic-clustering-for-seo-r-script">The Full Keyword And Topic Clustering For SEO R Script</a><ol></ol></li></ol></div>






<h2 class="wp-block-heading" id="what-you’ll-need">What You’ll Need</h2>



<p>Before getting started, you’ll need the following things:</p>



<ul class="wp-block-list"><li><strong>Some keyword data in CSV format:</strong> It doesn’t need to be a lot and it doesn’t really matter where you got it from. You’ll just need to be aware of the column headers and edit the code accordingly. For this example, I’ve used my Google Search Console data from this site</li><li><strong>R:</strong> The open-source statistical language. You can get it from <a href="https://cran.r-project.org/mirrors.html" target="_blank" rel="noreferrer noopener">here</a></li><li><strong>RStudio:</strong> The best IDE for R and a lot of other languages too. You can get it from <a href="https://rstudio.com/products/rstudio/download/" target="_blank" rel="noreferrer noopener">here</a></li><li><strong>The tm package for R: </strong>Packages are like plugins for the language which contain a lot of pre-built functions for specific tasks. The tm package is the best for text mining, which we’ll need to do during this process</li><li><strong>The Tidyverse package for R:</strong> I honestly can’t imagine a situation where I open R and don’t get Tidyverse loaded. The Tidyverse package is the definitive collection of other packages to make working with data and visualising it a lot more effective and actually fun</li></ul>



<p>That’s it. You have spent precisely zero pennies to do this!</p>



<h2 class="wp-block-heading" id="read-in-your-data">Read In Your Data</h2>



<p>I’m going to try and be pretty thorough with this piece, but I’m also not going to be writing a step-by-step guide to using R. There’s some assumed knowledge on your part, but where required, I’ll be linking to relevant resources so you can learn more about the language and the process.</p>



<p>The first thing we need to do when working with any data in R is to actually read the data into our environment. After you’ve created your RStudio project, get your dataset in CSV format into your working directory and use the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queries &lt;- read.csv("Queries.csv", stringsAsFactors = FALSE)</pre>



<p>Rename “Queries.csv” as whatever you’ve called the file with your keyword data.</p>
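


<p>Since the rest of the code depends on your column headers, it can be worth a quick check once the file is read in. This is just a sketch: the exampleCsv text below is made up to stand in for your own export, and your column names may well be different.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># A made-up, three-row stand-in for a keyword export (illustration only)
exampleCsv &lt;- "Query,Clicks,Impressions
r for seo,10,120
keyword clustering,4,45
topic clusters,2,30"

# read.csv can read from a string via the text argument, which is handy for testing
exampleQueries &lt;- read.csv(text = exampleCsv, stringsAsFactors = FALSE)

# List the column headers so you know what to reference later
names(exampleQueries)

# Check that the keyword column came in as text rather than a factor
is.character(exampleQueries$Query)</pre>



<p>If your keyword column isn’t called Query, that’s the name you’ll need to swap in wherever queries$Query appears later on.</p>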



<h2 class="wp-block-heading" id="preparing-your-text-data-for-clustering">Preparing Your Text Data For Clustering</h2>



<p>Clustering is primarily a numerical function, so we’re going to need to make our text workable in a numerical world. The way we’ll do this is to turn our keywords into a corpus using the Corpus function, clean that corpus up in line with best practice for text analysis, and then convert it into a <a href="https://cran.r-project.org/web/packages/tidytext/vignettes/tidying_casting.html" target="_blank" rel="noreferrer noopener">Document Term Matrix</a>.</p>
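


<p>If a Document Term Matrix is new to you, here’s a rough sketch of the idea in base R, using three made-up keywords rather than real data. Each row is a keyword, each column is a word, and each cell counts how often that word appears in that keyword. The tm package handles this far more robustly, so treat this purely as an illustration of the concept.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Three made-up keywords (illustration only)
exampleKeywords &lt;- c("seo tools", "free seo tools", "keyword clustering")

# Split each keyword into its individual words
wordList &lt;- strsplit(exampleKeywords, " ")

# Every distinct word becomes a column
allWords &lt;- sort(unique(unlist(wordList)))

# Count how often each word appears in each keyword
exampleMatrix &lt;- t(sapply(wordList, function(words){
  table(factor(words, levels = allWords))
}))

rownames(exampleMatrix) &lt;- exampleKeywords

print(exampleMatrix)</pre>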



<p>Firstly, after reading in our data, we want to make sure that we’ve got our packages installed and live in our R environment. You can do this with the following commands:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">install.packages("tm")

install.packages("wordcloud")

install.packages("tidyverse")

library(tm)

library(wordcloud)

library(tidyverse)
</pre>



<p>If you want to cut down on the amount of code you’re using when loading packages, you can use the combine (c) and lapply functions like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">instPacks &lt;- c("tidyverse", "tm", "wordcloud")

lapply(instPacks, require, character.only = TRUE)
</pre>
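


<p>If you’d rather not reinstall packages you already have, one option is to check what’s missing first. This is a sketch rather than the only way to do it: missingPackages and installIfMissing are helper names I’ve made up here, built on base R’s installed.packages and install.packages functions.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical helper: returns the packages from a list that aren't installed yet
missingPackages &lt;- function(pkgs){
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: installs only the packages that are missing
installIfMissing &lt;- function(pkgs){
  toInstall &lt;- missingPackages(pkgs)
  if(length(toInstall) > 0){
    install.packages(toInstall)
  }
  invisible(toInstall)
}

instPacks &lt;- c("tidyverse", "tm", "wordcloud")

# See which of our packages still need installing (this doesn't install anything)
missingPackages(instPacks)</pre>



<p>Running installIfMissing(instPacks) would then install only whatever that last line lists, and you can load everything with the lapply command as before.</p>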



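<p>Now we’ve got our packages in place, we need to create a corpus from our text, which we’ll turn into a Document Term Matrix once it’s cleaned. The variable is named dtm here, but at this stage it holds the corpus. We do that with the following command:</p>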



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">dtm &lt;- Corpus(VectorSource(queries$Query))</pre>



<p>By doing this, we’ve turned our Search Console queries into a Corpus in our R environment, and it’s ready to work with. Again, adapt the “queries$Query” to whatever output you have to work with.</p>



<h2 class="wp-block-heading" id="cleaning-a-corpus">Cleaning a Corpus</h2>



<p>When you’re working with text data in R, there are a few steps that you should take as standard, in order to ensure that you’re working with the most important words and also eliminating possible duplication due to capitalisation and punctuation.</p>



<p>I generally recommend doing the following to every corpus:</p>



<ul class="wp-block-list"><li><strong>Change all text to lower case:</strong> This brings consistency, rather than including duplicates caused by capitalisation</li><li><strong>Turn it to a plain-text document:</strong> This eliminates any possible rogue characters</li><li><strong>Remove punctuation:</strong> Eliminates duplication caused by punctuation</li><li><strong>Stem your words:</strong> By cutting extensions from the words in your corpus, you eliminate the duplication caused by adding “S” to some words, for example</li></ul>
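


<p>To see why this matters, here’s a quick base R sketch using a handful of made-up queries. It only covers the lower-casing and punctuation steps, since stemming needs extra tooling that tm handles for us later, but it shows how several variants collapse into one.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up queries that differ only by capitalisation and punctuation
exampleQueries &lt;- c("SEO Tools", "seo tools", "seo tools!", "Seo, tools")

# Lower-case everything
lowered &lt;- tolower(exampleQueries)

# Delete any punctuation characters
cleaned &lt;- gsub("[[:punct:]]", "", lowered)

# The four variants collapse into a single query
unique(cleaned)</pre>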



<p>In a lot of cases, I’d usually remove stopwords (terms such as “And”), but in this case, it’s worth keeping them since we’re going to be working with full queries rather than fragments.</p>



<p>Here’s how to clean that corpus up.</p>



<h3 class="wp-block-heading" id="a-corpus-cleaning-function-for-r">A Corpus-Cleaning Function For R</h3>



<p><a href="https://www.ben-johnston.co.uk/r-for-seo-part-4-functions/" target="_blank" rel="noreferrer noopener">Functions in R</a> are a way to make particular commands or pieces of code reproducible. Rather than entering a series of commands every time you need to use them, you can just wrap them into a function and use that function every time. They’re a time-saver and a great way to ensure that your code is easier to work with, as well as easier to share.</p>



<p>Based on the criteria above, here’s an R function to help you clean your keyword corpus up to prepare it for clustering your keywords:</p>



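<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">corpusClean &lt;- function(x){
  
  # Lower-case everything; content_transformer keeps the corpus structure intact
  lowCase &lt;- tm_map(x, content_transformer(tolower))
  
  # Documents built from a VectorSource are already plain text,
  # so no separate conversion step is applied here
  
  # Strip punctuation
  remPunc &lt;- tm_map(lowCase, removePunctuation)
  
  # Stem each word (stemDocument needs the SnowballC package installed)
  stemDoc &lt;- tm_map(remPunc, stemDocument)
  
  # Return the cleaned corpus as a Document Term Matrix
  DocumentTermMatrix(stemDoc)
}</pre>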



<p>Here, we’re using the tm package’s built-in functionality to transform our corpus in the ways described above. Again, I’m not going to go into the minutiae of how this works – I don’t think I’ve got another 11k+ word post in me today – but I hope the notation is reasonably clear.</p>



<p>Paste this function into your R console and then we need to actually run it on our corpus. This is really easy:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">corpusCleaned &lt;- corpusClean(dtm)</pre>



<p>We’ve created a new variable called corpusCleaned and run our function on the previous dtm variable, so we now have a Document Term Matrix built from a cleaned-up version of our original corpus. Now we’re ready to start clustering our keywords into topics.</p>



<h2 class="wp-block-heading" id="using-k-means-clustering-for-keywords-in-r">Using <em>K</em>-Means Clustering For Keywords In R</h2>



<p>There are loads of clustering models out there that can be used for keywords – some work better than others, but the best one to start with is always the tried and trusted <em>k</em>-means clustering. Once you’ve done a little bit of work with this, you can explore other models, but for me, this is the best place to start.</p>



<p><em>K</em>-means clustering is one of the most popular unsupervised machine learning models and it works by calculating the distance between different numerical vectors and grouping them accordingly. </p>
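


<p>To make that a bit more concrete, here’s a minimal sketch using base R’s kmeans function on six made-up points, not keyword data. Three points sit close together near zero and three sit close together near ten, so with two clusters we’d expect the algorithm to split them accordingly. The set.seed call is only there so the random starting points are repeatable.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">set.seed(42)

# Six made-up points: three near (0, 0) and three near (10, 10)
examplePoints &lt;- matrix(
  c(0.1, 0.2,
    0.3, 0.1,
    0.2, 0.4,
    10.1, 9.8,
    9.9, 10.2,
    10.3, 10.0),
  ncol = 2, byrow = TRUE
)

# Ask for two clusters; nstart tries several random starts and keeps the best
exampleFit &lt;- kmeans(examplePoints, centers = 2, nstart = 10)

# The cluster each point was assigned to
exampleFit$cluster</pre>



<p>The cluster numbers themselves are arbitrary labels. What matters is that the first three points share one label and the last three share the other.</p>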



<p><a href="https://en.wikipedia.org/wiki/K-means_clustering" target="_blank" rel="noreferrer noopener">Wikipedia</a> is going to explain the math far better than I can, so just look at that link if you’re interested, but what we’re here to talk about today is how this can be used to cluster your search terms or target keywords into topics.</p>



<p>Now that we’ve cleaned up our corpus and gotten it into a state where the <em>k</em>-means algorithm can tokenise the terms and match them up to numeric values, we’re ready to start clustering.</p>



<h2 class="wp-block-heading" id="finding-the-optimal-number-of-clusters">Finding The Optimal Number Of Clusters</h2>



<p>One of the biggest errors I see when people try to <a class="wpil_keyword_link" href="https://www.ben-johnston.co.uk/r-for-seo-part-8-apply-methods-in-r/"   title="apply" data-wpil-keyword-link="linked"  data-wpil-monitor-id="226">apply</a> data analytics techniques to digital marketing and SEO is that they never actually make the analysis useful, they just make a pretty graph for a pitch and the actual output is never usable. That’s certainly a possible failing of topic and keyword clustering if you’re not smart about it, which is why we’re going to run through how to identify the optimal number of clusters.</p>



<p>There are lots of different ways that you can identify the optimal number of clusters – I’m partial to a <a href="https://datascienceplus.com/finding-optimal-number-of-clusters/#:~:text=Bayesian%20Inference%20Criterion%20for%20k%20means&amp;text=Determine%20the%20optimal%20model%20and,the%20modelNames%20parameter%20to%20mclust." target="_blank" rel="noreferrer noopener">Bayesian information criterion</a>, myself, although good luck getting that to run quickly in R. Since today we’re just doing an introduction, I’m going to take you through the most commonly used way to identify the best number of topic clusters for your keywords: the Elbow Method.</p>



<h2 class="wp-block-heading" id="the-elbow-method">The Elbow Method</h2>



<p>The <a href="https://en.wikipedia.org/wiki/Elbow_method_(clustering)" target="_blank" rel="noreferrer noopener">Elbow Method</a> is probably the easiest way to find the optimal number of clusters (or <em>k</em>), and it’s certainly the fastest way to process it in R, but that still doesn’t mean it’s particularly quick.</p>



<p>Essentially, the Elbow Method runs the clustering for a range of cluster counts and, for each one, measures how much variation is left inside the clusters (the total within-cluster sum of squares). That figure generally falls as you add clusters, so we look for the point at which adding another cluster stops giving a meaningfully better model of the data. In other words, we use this model to identify the point at which adding extra clusters becomes a waste of time. After all, if this approach doesn’t become efficient, no one will use it.</p>
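
<p>To make that a bit more concrete, here’s a minimal sketch on made-up data rather than keywords. The toy matrix and the choice of six values of <em>k</em> are my own assumptions for illustration; the only real function involved is base R’s kmeans and its tot.withinss output.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy data: three well-separated groups of 2-D points (made up for illustration)
set.seed(42)
toy &lt;- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 10, sd = 0.3), ncol = 2)
)

# Total within-cluster sum of squares for k = 1 to 6
wss &lt;- sapply(1:6, function(k) kmeans(toy, centers = k, nstart = 10)$tot.withinss)

# The totals themselves, then how much each extra cluster takes off the total
round(wss, 1)
round(-diff(wss), 1)</pre>

<p>With data like this, the reductions should be large up to three clusters and small afterwards, and that change of pace is the elbow we’re looking for.</p>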



<p>But how do we make the Elbow Method work? How do we use it to identify our target number of clusters, our <em>k</em>? The easiest way to do that is to visualise our clusters and take a judgement from there, hence why we installed the ggplot2 package earlier.</p>



<h3 class="wp-block-heading" id="running-the-elbow-method-in-r">Running The Elbow Method In R</h3>



<p>Before we get started, I have to say one thing: R isn’t the fastest language in the world (hence why I’m moving away from it these days) and it may take quite a while to run this process if you’ve got a large dataset. If you do have a lot of keywords, make sure you’ve got something else to do, because otherwise, you might be looking at your screen for a while.</p>



<p>Firstly, we need to create an empty data frame to put our cluster information into. That’s easy, we’ll just use the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kFrame &lt;- data.frame()</pre>



<p>Now we have to use a <a href="https://www.datamentor.io/r-programming/for-loop/" target="_blank" rel="noreferrer noopener">for loop</a> to run the clustering algorithm and put it into that data frame. This is where the processing time comes in, and I know it’s not the cleanest way to run it in R, but it’s the way that I’ve found it to work the best.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for(i in 1:100){
  k &lt;- kmeans(corpusCleaned, centers = i, iter.max = 100)
  kFrame &lt;- rbind(kFrame, cbind(i, k$tot.withinss))
}</pre>



<p>This <a href="https://www.ben-johnston.co.uk/r-for-seo-part-7-loops/"  data-wpil-monitor-id="208">loop</a> will take our tidied-up corpus (our corpusCleaned variable), run the <em>k</em>-means algorithm on it once for every number of clusters from 1 to 100, and put the total within-cluster sum of squares from each run into our empty data frame. Obviously we don’t want 100 clusters – no one’s going to work with that. What we want to find here is the break point, the number at which we get diminishing returns by adding new clusters.</p>



<p>It may take a while to run this if you’ve got quite a lot of keywords, but once it’s done, paste the following:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">names(kFrame) &lt;- c("cluster", "total")</pre>



<p>All we’re doing here is naming our column headers, but it’ll be important for our next stage of finding <em>k</em>.</p>
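
<p>If growing a data frame inside a for loop bothers you, the same table can be built in one go. This is only a sketch: elbowFrame is a hypothetical helper name of mine, not something from a package, and the toy matrix stands in for corpusCleaned so the example runs on its own.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical helper (not from a package): run k-means for k = 1..maxK
# and return the same two named columns as kFrame
elbowFrame &lt;- function(x, maxK = 100, ...) {
  x &lt;- as.matrix(x)  # convert once rather than on every run
  totals &lt;- vapply(
    seq_len(maxK),
    function(i) kmeans(x, centers = i, ...)$tot.withinss,
    numeric(1)
  )
  data.frame(cluster = seq_len(maxK), total = totals)
}

# Toy usage: 50 made-up rows, testing up to 10 clusters
set.seed(12)
toy &lt;- matrix(runif(200), ncol = 4)
kToy &lt;- elbowFrame(toy, maxK = 10, iter.max = 100)
head(kToy)</pre>

<p>On the real data the call would look something like elbowFrame(corpusCleaned, maxK = 100, iter.max = 100). It won’t be dramatically faster, since the clustering itself is the slow part, but the columns come out already named.</p>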



<h3 class="wp-block-heading" id="visualising-the-elbow-method-using-ggplot2">Visualising The Elbow Method Using GGPlot2</h3>



<p>As I said earlier, there’s still a certain amount of manual work involved in this keyword &amp; topic clustering, and a big chunk of that is around finding <em>k</em> and then identifying what the clusters actually are.</p>



<p>Fortunately, it’s not actually a <em>lot</em> of manual work and it will really help with your SEO and content targeting, so it’s really worth taking the time.</p>



<p>Use the following command to create a plot of your clusters:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ggplot(data = kFrame, aes(x=cluster, y=total, group=1))+ 
  theme_bw(base_family = "Arial")+ geom_line(colour = "darkgreen")+
  scale_x_continuous(breaks = seq(from=0, to=100,by=5))
</pre>



<p>Certain elements, such as the colour, font and your chosen dataset can be switched up, obviously.</p>



<p>Using my Search Console dataset, I get the following result:</p>



<figure class="wp-block-image size-large"><img decoding="async" width="865" height="556" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot.png" alt="Elbow method for keyword clustering in SEO" class="wp-image-2896" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot.png 865w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-300x193.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-150x96.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-768x494.png 768w" sizes="(max-width: 865px) 100vw, 865px" /></figure>



<h3 class="wp-block-heading" id="using-the-elbow-method-to-identify-the-optimal-number-of-clusters">Using The Elbow Method To Identify The Optimal Number Of Clusters</h3>



<p>Now we’ve got our graph, we need to use a bit of human intelligence to identify our number of clusters. It’s not perfect, and that’s why other models exist, but I hope this is a good start for you.</p>



<p>When we look at these charts with the Elbow Method, we’re looking for the point where the curve bends and the sharp drop starts to level off: the point at which additional clusters become less useful. Looking at the chart below, using my dataset, we can see that seven clusters is the point at which we should stop adding extra clusters.</p>



<figure class="wp-block-image size-large"><img decoding="async" width="865" height="556" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-Copy.png" alt="Kmeans clustering for SEO with optimal clusters identified" class="wp-image-2899" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-Copy.png 865w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-Copy-300x193.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-Copy-150x96.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/02/Rplot-Copy-768x494.png 768w" sizes="(max-width: 865px) 100vw, 865px" /></figure>
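
<p>If you’d like a rough numeric sanity check on what your eye is telling you, one simple heuristic is to find the point on the curve that sits furthest below a straight line drawn from the first point to the last. To be clear, this is my own illustrative sketch and not part of the Elbow Method proper, the totals are made up, and it assumes the curve falls from its first point to its last.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Made-up totals shaped like an elbow curve (not real Search Console data)
kToy &lt;- data.frame(
  cluster = 1:10,
  total   = c(1000, 900, 800, 700, 600, 500, 400, 380, 365, 355)
)

# Rescale both axes to 0-1 so neither dominates
x &lt;- (kToy$cluster - min(kToy$cluster)) / diff(range(kToy$cluster))
y &lt;- (kToy$total - min(kToy$total)) / diff(range(kToy$total))

# After rescaling, a falling curve runs from (0, 1) to (1, 0).
# gap is how far each point sits below the straight line joining those ends.
gap &lt;- 1 - x - y

# The candidate elbow is the point with the biggest gap
elbowK &lt;- kToy$cluster[which.max(gap)]
elbowK</pre>

<p>Treat the answer as a second opinion rather than a verdict. On noisy curves it can disagree with a sensible human reading of the chart, and the human reading should usually win.</p>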



<p>Now that we’ve identified <em>k</em>, we need to run the clustering with that number, match the clusters back to our original dataset and name our topics.</p>



<p>From the piece of analysis above, we can see that the optimal number of clusters on this dataset, our <em>k</em>, is seven. Now we need to run the following commands to match them back to our original dataset:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kmeans7 &lt;- kmeans(corpusCleaned, 7)</pre>



<p>This is fairly self-explanatory, I hope, but essentially what we’re doing here is creating a new variable called kmeans7 (change it to whatever you like), telling R to use its base kmeans command on our corpusCleaned variable and to use it on the number of clusters we’ve identified. You can obviously adapt this to whatever number of SEO keyword or topic clusters your analysis identified using your own datasets.</p>
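
<p>One thing worth knowing is that <em>k</em>-means starts from randomly chosen centres, so running it twice can give you differently numbered (or slightly different) clusters. The sketch below, on a made-up matrix standing in for corpusCleaned, shows two base R options that help: set.seed to make a run repeatable, and the nstart argument of kmeans, which tries several random starts and keeps the best one. The value of 25 is just my assumption for illustration.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy matrix standing in for corpusCleaned (made up for illustration)
set.seed(12)
toy &lt;- matrix(runif(300), ncol = 3)

# Setting the same seed before each run gives the same clusters back
set.seed(12)
runA &lt;- kmeans(toy, centers = 7, nstart = 25, iter.max = 100)
set.seed(12)
runB &lt;- kmeans(toy, centers = 7, nstart = 25, iter.max = 100)

# Check the two runs agree, then look at the cluster sizes
identical(runA$cluster, runB$cluster)
table(runA$cluster)</pre>

<p>More starts mean more processing time, so on a big keyword set you may want a smaller nstart than on a toy example.</p>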



<p>Finally, you’ll want to turn this into a clean data frame. You can do that like so:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">kwClusters &lt;- as.data.frame(cbind(queries$Query, kmeans7$cluster))

names(kwClusters) &lt;- c("Query", "Cluster")</pre>






<p>Now your original keywords are in a data frame with their assigned cluster in the next column and the columns are named “Query” and “Cluster”, keeping them consistent with our main dataset.</p>
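
<p>A small gotcha with this approach: because cbind builds a matrix, and a matrix can only hold one type, the cluster numbers come out as text (or as a factor on R versions before 4.0) rather than as numbers. That’s fine for everything in this post, but it does mean that with ten or more clusters they’ll sort as “1”, “10”, “2” and so on. Here’s a quick sketch with made-up values showing what happens and how to convert if you need to.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy stand-ins for queries$Query and kmeans7$cluster (made up for illustration)
toyQuery   &lt;- c("track email in ga", "r for seo", "keyword clustering")
toyCluster &lt;- c(1L, 2L, 2L)

# Same construction as above: cbind() makes a character matrix first
viaCbind &lt;- as.data.frame(cbind(toyQuery, toyCluster))
names(viaCbind) &lt;- c("Query", "Cluster")

# The cluster column is no longer numeric at this point
classBefore &lt;- class(viaCbind$Cluster)
classBefore

# Convert back to whole numbers if you need numeric cluster ids
# (as.character first, so this is also safe if the column is a factor)
viaCbind$Cluster &lt;- as.integer(as.character(viaCbind$Cluster))
class(viaCbind$Cluster)</pre>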



<p>From here, we’ll want to get those clusters assigned to our main dataset. Dplyr from the Tidyverse has a really handy left_join function which works a bit like Excel’s Index Match and will let you match this easily.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">queries &lt;- left_join(queries, kwClusters, by = "Query")</pre>



<p>Again, you’ll need to adapt your variables to fit your datasets, but this is how it’s working with my particular example.</p>
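
<p>One assumption baked into that join is that each query appears only once in your dataset. If your export has the same query on several rows (split by page or date, for example), a join on Query will match every copy against every other copy and multiply your rows. The sketch below uses a made-up data frame to show a quick base R check, plus one possible way of collapsing duplicates before you start clustering.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy data with one repeated query (made up for illustration)
toyQueries &lt;- data.frame(
  Query  = c("r for seo", "keyword clustering", "r for seo"),
  Clicks = c(10, 4, 3),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when every value is unique,
# otherwise the position of the first repeat
anyDuplicated(toyQueries$Query)

# One option: total the metric per query so each query has a single row
toyUnique &lt;- aggregate(Clicks ~ Query, data = toyQueries, FUN = sum)
toyUnique</pre>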



<h2 class="wp-block-heading" id="what-can-you-do-with-this">What Can You Do With This?</h2>



<p>There’s a lot that can be done with this, and I’ve updated this post with a few examples of how to do the things I’d previously only suggested.</p>



<h3 class="wp-block-heading" id="explore-your-clusters-by-subsetting">Explore Your Clusters By Subsetting</h3>



<p>The easiest way to get to grips with what’s contained in each cluster is to <a href="https://www.statmethods.net/management/subset.html" target="_blank" rel="noreferrer noopener">subset</a> and explore accordingly.</p>



<p>Subsetting is one of the most essential elements of working with large datasets, so it’s definitely worth getting to grips with. Fortunately, R has a number of base functions to let you do just that.</p>



<p>Let’s take a look at our first cluster in isolation:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">clusterOne &lt;- subset(queries, Cluster == 1)</pre>



<p>Here, we’ve cut down our dataset to only look at cluster one. When you’re subsetting or using other exact matches in R, you need to use the double equals sign (==). A single = is used for assignment and for naming arguments, so you won’t get the filter you’re expecting.</p>



<p>Let’s explore it a little. First, we want to see how many observations (keywords, in this case) we have in this cluster. Nice and easy – in fact, in RStudio, you can just look at the Data section of the Environment pane, like so:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="441" height="112" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/dataPane.png" alt="RStudio Data Pane" class="wp-image-2925" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/dataPane.png 441w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/dataPane-300x76.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/dataPane-150x38.png 150w" sizes="(max-width: 441px) 100vw, 441px" /></figure>



<p>But let’s do it with some code anyway. The nrow function in base R will do that for you.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">nrow(clusterOne)</pre>
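
<p>If you want the size of every cluster at once rather than one subset at a time, base R’s table function will count them for you. The data frame here is made up purely so the sketch runs on its own; on the real data you’d point it at queries$Cluster.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy stand-in for the joined queries data (made up for illustration)
toyQueries &lt;- data.frame(
  Query   = c("track email ga", "email tracking", "r for seo", "r loops", "ggplot chart"),
  Cluster = c("1", "1", "2", "2", "2"),
  stringsAsFactors = FALSE
)

# Count the keywords in each cluster, biggest first
clusterSizes &lt;- table(toyQueries$Cluster)
sort(clusterSizes, decreasing = TRUE)</pre>

<p>A cluster that’s far bigger than the rest is often a catch-all, which is a good hint that it needs a closer manual look.</p>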



<p>How about if we want to see the number of impressions and clicks from that cluster? We can use the <a href="https://dplyr.tidyverse.org/reference/summarise.html" target="_blank" rel="noreferrer noopener">Tidyverse’s summarise function</a> for that:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">clusterOne %>% summarise(Impressions = sum(Impressions), Clicks = sum(Clicks))</pre>



<p>This will give us an output of the total number of impressions and clicks from that cluster, which can be useful when identifying the opportunity available. Obviously, you can use this for other elements as well. Perhaps you’ve merged some search volume data in, for example, or you want to see what your average position is for this cluster.</p>



<p>Exploring your data in subsets will give you a much greater understanding of what each cluster contains, so it’s well worth doing.</p>



<h3 class="wp-block-heading" id="create-wordclouds-by-cluster">Create Wordclouds By Cluster</h3>



<p>Wordclouds are something that always tend to go over well in client presentations, and, although a lot of designers hate them, they’re often a fantastic way to see the most common keywords and terms in your dataset. By doing this by cluster, we’ve got a great way to dig into what each cluster is discussing.</p>



<p>The wordcloud package for R has everything you need to do this.</p>



<p>Let’s take a look at the queries in my first cluster:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">wordcloud(clusterOne$Query, scale=c(5,0.5), max.words=250, random.order=FALSE, 
          rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8,"Dark2"))</pre>



<p>This will create a wordcloud from that keyword cluster’s query column, give it some pretty colours and present it in the RStudio plots pane. In this cluster’s case, the wordcloud looks like this:</p>



<figure class="wp-block-image size-large"><img decoding="async" width="865" height="556" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/Rplot01.png" alt="Keyword Cluster SEO Wordcloud" class="wp-image-2926" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/Rplot01.png 865w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/Rplot01-300x193.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/Rplot01-150x96.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/Rplot01-768x494.png 768w" sizes="(max-width: 865px) 100vw, 865px" /></figure>



<p>As you can see from here, cluster one of my Search Console data is all about <a href="https://www.ben-johnston.co.uk/tracking-email-google-analytics/">tracking email in Google Analytics</a>.</p>



<p>Wordclouds are a really quick and visual way to explore and identify the topics covered in your keyword clusters and using R instead of a separate tool will help you do that nice and easily without needing to leave RStudio.</p>
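
<p>If you’d rather have the same information as a plain list, for a spreadsheet or to name the cluster, counting the most common words does the job without any extra packages. This is a rough sketch: topTerms is a hypothetical helper of mine, the queries are made up, and it doesn’t remove stop words or stem anything.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy stand-in for the joined queries data (made up for illustration)
toyQueries &lt;- data.frame(
  Query   = c("track email in google analytics", "google analytics email tracking",
              "r for seo", "seo keyword clustering in r"),
  Cluster = c("1", "1", "2", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n most frequent words in one cluster's queries
topTerms &lt;- function(data, cluster, n = 5) {
  # Split each query on whitespace and pool all the words together
  words &lt;- unlist(strsplit(tolower(data$Query[data$Cluster == cluster]), "\\s+"))
  head(sort(table(words), decreasing = TRUE), n)
}

topTerms(toyQueries, "1")</pre>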



<h3 class="wp-block-heading" id="plotting-a-combo-chart-with-r-amp;-ggplot">Plotting A Combo Chart With R &amp; GGPlot</h3>



<p>The final idea I’m going to run through today is to plot a combo chart where we look at impressions and clicks by cluster. The analyst in me is not a big fan of combo charts since they’re rather flawed, but they are a great way to quickly identify opportunities to improve SEO performance in your keyword clusters. I’m also not very good at making R graphs look pretty, so sorry for that!</p>



<p>Again, you can adapt this to look at a wide variety of different metrics, and using the <a href="https://ggplot2.tidyverse.org/" target="_blank" rel="noreferrer noopener">ggplot2 package</a> from the Tidyverse gives you a lot of fun graphical avenues to explore.</p>



<p>Firstly, we want to create a dataframe containing a summary of the dataset. You can do that like so with a Dplyr function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">querySummary &lt;- queries %>% group_by(Cluster) %>%
  summarise(Impressions = sum(Impressions), Clicks = sum(Clicks), Avg.Position = mean(Position),
            Avg.CTR = mean(CTR))</pre>



<p>Now we have a frame which contains all our keyword clusters with the total impressions, total clicks and the average position and CTR in place, which will be useful for a wide variety of visualisations, not just this example.</p>
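
<p>A word of caution on that average CTR: taking the mean of each keyword’s CTR gives a keyword with ten impressions the same weight as one with a thousand. If you want the CTR of the cluster as a whole, dividing total clicks by total impressions is usually closer to what you mean. Here’s a minimal base R sketch on made-up numbers. It also assumes your CTR and metric columns are numeric; depending on your export, CTR may arrive as text with a percent sign, in which case it needs converting before you can average it at all.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy stand-in for the joined queries data (made up for illustration)
toyQueries &lt;- data.frame(
  Cluster     = c("1", "1", "2"),
  Impressions = c(1000, 10, 200),
  Clicks      = c(10, 5, 20)
)

# Total impressions and clicks per cluster
totals &lt;- aggregate(cbind(Impressions, Clicks) ~ Cluster, data = toyQueries, FUN = sum)

# Cluster-level CTR: total clicks divided by total impressions
totals$CTR &lt;- totals$Clicks / totals$Impressions
totals</pre>

<p>In cluster one here, the two keywords have CTRs of 1% and 50%, so a simple average would say about 25%, while the cluster really converts roughly 1.5% of its impressions. In the dplyr version above, the equivalent would be adding something like CTR = sum(Clicks) / sum(Impressions) inside summarise.</p>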



<p>The code below will create a column graph with the impressions by cluster and the clicks on a line chart with a secondary axis scaled by 20 to accommodate the differences between the two variables. You can absolutely feel free to change the colours to make it look better.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">ggplot(querySummary)+
  geom_col(aes(Cluster, Impressions), size = 1, colour = "black", fill = "darkgray")+
  geom_line(aes(Cluster, 20*Clicks), size = 1, colour = "darkgreen", group =1)+
  scale_y_continuous(sec.axis = sec_axis(~./20, name = "Clicks"))</pre>



<p>This will give the following graph:</p>



<figure class="wp-block-image size-large"><img decoding="async" width="865" height="556" src="https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/comboChart.png" alt="Keyword Clustering SEO Combo Chart GGPlot R" class="wp-image-2928" srcset="https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/comboChart.png 865w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/comboChart-300x193.png 300w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/comboChart-150x96.png 150w, https://www.ben-johnston.co.uk/wp-content/uploads/2021/04/comboChart-768x494.png 768w" sizes="(max-width: 865px) 100vw, 865px" /></figure>
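
<p>The 20 in that chart code suits my data, but it won’t necessarily suit yours. One option is to work the scaling factor out from the data so the tallest line point lines up with the tallest column. This is a sketch on a made-up summary table, with scaleFactor being my own variable name; the plotting part only runs if ggplot2 is installed.</p>

<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Toy stand-in for querySummary (made up for illustration)
toySummary &lt;- data.frame(
  Cluster     = c("1", "2", "3"),
  Impressions = c(4000, 1500, 900),
  Clicks      = c(120, 90, 20)
)

# Ratio of the largest impressions value to the largest clicks value
scaleFactor &lt;- max(toySummary$Impressions) / max(toySummary$Clicks)

# Same combo chart, with the calculated factor in place of the fixed 20
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  comboPlot &lt;- ggplot(toySummary) +
    geom_col(aes(Cluster, Impressions), colour = "black", fill = "darkgray") +
    geom_line(aes(Cluster, scaleFactor * Clicks), colour = "darkgreen", group = 1) +
    scale_y_continuous(sec.axis = sec_axis(~ . / scaleFactor, name = "Clicks"))
  print(comboPlot)
}</pre>

<p>The secondary axis divides by the same factor the line was multiplied by, so the numbers on the right-hand side still read as real clicks.</p>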



<p>And there you have it – a really simple introduction to keyword clustering for SEO and some things you can do with it before you start creating your content accordingly.</p>



<h2 class="wp-block-heading" id="wrapping-up">Wrapping Up</h2>



<p>I really hope that this will help those of you who have been wanting to use keyword and topic clustering to better target your SEO but haven’t had the budget for the tools. Everything I’ve talked about here is free to use and, while you might need to do a bit of extra reading to fully get what’s going on, I really hope it’s been helpful.</p>



<p>The most important thing to do with this is to be ruthless. Keep your keyword and topic clusters small and investigate the classifications manually because the wider the base, the less you’ll do with it and the less effective it’ll be.</p>



<p>There is so much that can be done with this that one day, if the world ever goes back to normal, I’ll maybe do a talk on it somewhere, but the key thing to understand is that techniques like this don’t do the job for you, but they do help you do the job a hell of a lot better.</p>



<p>As usual, any questions, shoot me a line on <a href="https://twitter.com/ben_johnston80" target="_blank" rel="noreferrer noopener">Twitter</a> or through the <a href="https://www.ben-johnston.co.uk/get-in-touch/">contact form</a>.</p>



<p>I’ve put the full script below; amend as required.</p>



<p>Until next time.</p>



<h2 class="wp-block-heading" id="the-full-keyword-and-topic-clustering-for-seo-r-script">The Full Keyword And Topic Clustering For SEO R Script</h2>



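<pre class="EnlighterJSRAW" data-enlighter-language="r" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">## Load Packages (install them first with install.packages() if you haven't already)

options(warn=-1)

set.seed(12)

## Windows only, and has no effect in recent versions of R (4.2 onwards)
memory.limit(1600000000)

## stemDocument from tm also needs the SnowballC package installed
instPacks &lt;- c("tidyverse", "tm", "wordcloud")

lapply(instPacks, require, character.only = TRUE)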

## Read Data

queries &lt;- read.csv("Queries.csv", stringsAsFactors = FALSE)

## Prepare Text

dtm &lt;- Corpus(VectorSource(queries$Query))

corpusClean &lt;- function(x){
  
  lowCase &lt;- tm_map(x, tolower)
  
  plainText &lt;- tm_map(lowCase, PlainTextDocument)
  
  remPunc &lt;- tm_map(lowCase, removePunctuation)
  
  stemDoc &lt;- tm_map(remPunc, stemDocument)
  
  output &lt;- DocumentTermMatrix(stemDoc)
  
}

corpusCleaned &lt;- corpusClean(dtm)

## Elbow Method To Find K

kFrame &lt;- data.frame()

for(i in 1:100){
  k &lt;- kmeans(corpusCleaned, centers = i, iter.max = 100)
  kFrame &lt;- rbind(kFrame, cbind(i, k$tot.withinss))
}

names(kFrame) &lt;- c("cluster", "total")

## Visualise Elbow To Find K

ggplot(data = kFrame, aes(x=cluster, y=total, group=1))+ 
  theme_bw(base_family = "Arial")+ geom_line(colour = "darkgreen")+
  scale_x_continuous(breaks = seq(from=0, to=100,by=5))

## Identify Clusters

kmeans7 &lt;- kmeans(corpusCleaned, 7)

kwClusters &lt;- as.data.frame(cbind(queries$Query, kmeans7$cluster))

names(kwClusters) &lt;- c("Query", "Cluster")

## Merge Clusters To Query Data

queries &lt;- left_join(queries, kwClusters, by = "Query")

## Explore Clusters With Subsets

clusterOne &lt;- subset(queries, Cluster == 1)

## Wordclouds

wordcloud(clusterOne$Query, scale=c(5,0.5), max.words=250, random.order=FALSE, 
          rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8,"Dark2"))

## Impressions/ Clicks By Cluster

querySummary &lt;- queries %>% group_by(Cluster) %>%
  summarise(Impressions = sum(Impressions), Clicks = sum(Clicks), Avg.Position = mean(Position),
            Avg.CTR = mean(CTR))

ggplot(querySummary)+
  geom_col(aes(Cluster, Impressions), size = 1, colour = "black", fill = "darkgray")+
  geom_line(aes(Cluster, 20*Clicks), size = 1, colour = "darkgreen", group =1)+
  scale_y_continuous(sec.axis = sec_axis(~./20, name = "Clicks"))
</pre>



<p>This post was written by Ben Johnston on <a href="https://www.ben-johnston.co.uk">Ben Johnston</a></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
