Below is what happened in search today, as reported on Search Engine Land and from other places across the web. The post SearchCap: Google images, mobile shopping & election hub appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/searchcap-google-images-mobile-shopping-election-hub-258869
You are now 50% less likely to see the image box show up in your Google search results page. Is it a bug or a feature? The post Google drastically reduces how often the image box shows up in the web search results appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/google-drastically-reduces-often-image-box-shows-web-search-results-258846
As we edge closer to the holidays, it's time to make the most of your Google Shopping campaigns. Columnist David Rekuc reveals seven tips to help you bring your shopping campaigns up a notch. The post 7 advanced tips for Google Shopping ads appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/7-advanced-tips-google-shopping-ads-258660
Bidding on your competitors' brand terms could yield positive results, but columnist Jacob Baadsgaard warns that it might cause problems in the long run. The post Bidding on the competition: Is it really worth it? appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/bidding-competition-really-worth-258398
Advertisers have seen many big updates from Google over the past year, but columnist Andy Taylor makes the case that the most impactful updates may well have been the least publicized. The post The biggest Google ad updates are also the quietest appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/biggest-google-ad-updates-also-quietest-258423
Survey says nearly 90 percent of mobile consumers turn to search first. The post Google: Search the primary and most often used mobile shopping tool appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/google-search-primary-often-used-mobile-shopping-tool-258649
Google says election-related searches for this election cycle are up 240% over 2012. The post Google Trends Election Hub offers deep dive into search trends for 2016 candidates & political issues appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/google-trends-election-hub-offers-deep-dive-search-trends-2016-candidates-political-issues-258784

Posted by petewailes

Progressive Web Apps. Ah yes, those things that Google would have you believe are a combination of Gandhi and Dumbledore, come to save the world from the terror that is the Painfully Slow Website™. But what actually makes a PWA? Should you have one? And if you create one, how will you make sure it ranks? Well, read on to find out...

What's a PWA?

Given that Google came up with the term, I thought we'd kick off with their definition: "A Progressive Web App uses modern web capabilities to deliver an app-like user experience."

The really exciting thing about PWAs: they could make app development less necessary. Your mobile website becomes your app. Speaking to some of my colleagues at Builtvisible, this seemed to be a point of interesting discussion: do brands need an app and a website, or a PWA?

Fleshing this out a little, this means we'd expect things like push notifications, background sync, the site/app working offline, having a certain look/design to feel like a native application, and being able to be set on the device home screen. These are things we traditionally haven't had available to us on the web. But thanks to new browsers supporting more and more of the HTML5 spec and advances in JavaScript, we can start to create some of this functionality. On the whole, Progressive Web Apps are:
It's worth taking a moment to unpack the "app-like" part of that. Fundamentally, there are two parts to a PWA: service workers (which we'll come to in a minute), and application shell architecture. Google defines this as: ...the minimal HTML, CSS, and JavaScript powering a user interface. The application shell should:

This method of loading content allows for incredibly fast perceived speed. We're able to get something that looks like our site in front of a user almost instantly, just without any content. The page will then go and fetch the content and all's well. Obviously, if we actually did things this way in the real world, we'd run into SEO issues pretty quickly, but we'll address that later too. If, then, at their core, Progressive Web Apps are just websites served in a clever way with extra features for loading stuff, why would we want one?

The use case

Let me be clear before I get into this: for most people, a PWA is something you don't need. That's important enough that it bears repeating, so I'll repeat it: You probably don't need a PWA. The reason for this is that most websites don't need to be able to behave like an app. This isn't to say that there's no benefit to the things PWA functionality can bring, but for many sites, the benefits don't outweigh the time it takes to implement the functionality at the moment. When should you look at a PWA, then? Well, let's look at a checklist of things that may indicate that you do need one...

Signs a PWA may be appropriate

You have:
In short, you have something beyond a normal website, with interactive or time-sensitive components, or rapidly released or updated content. A good example is the Google Weather PWA. If you're running a normal site, with a blog that maybe updates every day or two, or even less frequently, then whilst it might be nice to have a site that acts as a PWA, there are probably more useful things you can be doing with your time for your business.

How they work

So, you have something that would benefit from this sort of functionality, but need to know how these things work. Welcome to the wonder that is the service worker. Service workers can be thought of as a proxy that sits between your website and the browser: they can intercept the requests a page asks the browser to make and hijack the responses given back. That means we can do things like, for example, hold a copy of data requested, so when it's asked for again, we can serve it straight back (this is called caching). This means we can fetch data once, then replay it a thousand times without having to fetch it again. Think of it like a musician recording an album — it means they don't have to play a concert every time you want to listen to their music. Same thing, but with network data. If you want a more thorough explanation of service workers, check out this moderately technical talk given by Jake Archibald from Google.

What service workers can do

Service workers fundamentally exist to deliver extra features which haven't been available to browsers until now. These include things like:
It's planned that in the future they'll be able to do even more than they currently can; for now, though, these are the sorts of features you'll be able to make use of. Obviously, these mostly load data via AJAX, once the app is already loaded.

What are the SEO implications?

So you're sold on Progressive Web Apps. But if you create one, how will you make sure it ranks? As with any new front-end technology, there are always implications for your SEO visibility. But don't panic; the potential issues you'll encounter with a PWA have been solved before by SEOs who have worked on JavaScript-heavy websites. For a primer on that, take a look at this article on JS SEO. There are a few issues you may encounter if you're going to have a site that makes use of application shell architecture. Firstly, it's pretty much required that you use some form of JS framework or view library, like Angular or React. If this is the case, you're going to want to take a look at some Angular or React SEO advice. If you're using something else, the short version is you'll need to be pre-rendering pages on the server, then picking up with your application when it's loaded. This enables you to have all the good things these tools give you, whilst also serving something Google et al. can understand. Despite Google's recent advice that they're getting good at rendering this sort of application, we still see plenty of examples in the wild of them flailing horribly when they crawl JS-heavy sites. Assuming you're in the world of clever JS front-end technologies, to make sure you do things the PWA way, you'll also need to deliver the CSS and JS required to make the page work along with the HTML, not just links to those files. Obviously, this means you're going to increase the size of the page you're sending down the wire, but it has the upside of meaning that the page will load instantly.
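To make the service worker side of this concrete, here is a minimal, illustrative sketch (all file names and the cache name are hypothetical, not taken from the article) of a worker that pre-caches an application shell at install time and answers repeat requests cache-first:

```javascript
// sw.js — illustrative service worker sketch; file names are hypothetical.
// Pre-caches the application shell so repeat visits render instantly, even offline.
const SHELL_CACHE = 'app-shell-v1';
const SHELL_ASSETS = ['/', '/shell.css', '/shell.js'];

// `self` is the ServiceWorkerGlobalScope in a browser; the guard keeps this
// sketch inert when read outside a service worker context.
if (typeof self !== 'undefined' && typeof caches !== 'undefined') {
  self.addEventListener('install', event => {
    // Download and store the shell before the worker activates.
    event.waitUntil(
      caches.open(SHELL_CACHE).then(cache => cache.addAll(SHELL_ASSETS))
    );
  });

  self.addEventListener('fetch', event => {
    // Cache-first: serve from the cache, fall back to the network.
    event.respondWith(
      caches.match(event.request).then(hit => hit || fetch(event.request))
    );
  });
}
```

The worker would be registered from the page with `navigator.serviceWorker.register('/sw.js')`, guarded by a feature check so unsupported browsers simply keep working normally, which is the "progressive" part of the acronym.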
More than that, though, with all the JS (required for pick-up) and CSS (required to make sense of the design) delivered immediately, the browser will be able to render your content and deliver something that looks correct and works straight away. Again, as we're going to be using service workers to cache content once it's arrived, this shouldn't have too much of an impact. We can also cache all the external CSS and JS files separately, and load them from the cache store rather than fetching them every time. This does make it very slightly more likely that the PWA will fail the first time a user tries to request your site, but you can still handle this case gracefully with an error message or default content, and retry on the next page view.

There are other potential issues people can run into, as well. The Washington Post, for example, built a PWA version of their site, but it only works on a mobile device. Obviously, that means the site can be crawled nicely by Google's mobile bots, but not the desktop ones. It's important to respect the P part of the acronym — the website should enable features that a user can make use of, but still work in a normal manner for those using browsers that don't support them. It's about enhancing functionality progressively, not demanding that people upgrade their browser.

The only slightly tricky thing with all of this is that, for the best experience, it requires you to design your application offline-first. How that's done is covered in Jake's talk above. The only issue with going down that route: you're only serving content once someone's arrived at your site and waited long enough for everything to load. Obviously, in the case of Google, that's not going to work well. So here's what I'd suggest. Rather than just sending your application shell, then using AJAX to request content on load, and then picking up, use this workflow instead:
Adding in the data required means that, on load, we don't have to make an AJAX call to get the initial data; instead, we bundle that in too, so we get something that can render content instantly as well. As an example, think of a weather app. In the basic model, we'd send the user all the content needed to show a basic version of our app, but not the data saying what the weather is. In this modified version, we also send along today's weather, and only go to the server with an AJAX call for subsequent data requests. This means we still deliver content that Google et al. can index, without possible issues from our AJAX calls failing. From Google's and the user's perspective, we're just delivering a very high-performance initial load, then registering service workers to give faster experiences on every subsequent page, and possibly extra functionality. In the case of a weather app, that might mean pre-fetching tomorrow's weather each day at midnight, or notifying the user if it's going to rain, for example.

Going further

If you're interested in learning more about PWAs, I highly recommend reading this guide to PWAs by Addy Osmani (a Google Chrome engineer), and then putting together a very basic working example, like the train one Jake mentions in his YouTube talk referenced earlier. If you're interested in that, I recommend Jake's Udacity course on creating a PWA, available here.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read! via The Moz Blog http://tracking.feedpress.it/link/9375/4415486
Below is what happened in search today, as reported on Search Engine Land and from other places across the web. The post SearchCap: Google extended text ads, Bing Ads structured snippets & Doodle 4 Google appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/searchcap-google-extended-text-ads-bing-ads-structured-snippets-doodle-4-google-258775
Students in grades K through 12 have until December 2 to submit artwork based on the theme: "What I see for the future." The post 2016 Doodle 4 Google contest asks students to look to the future appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/2016-doodle-4-google-contest-asks-students-look-future-258753
Columnist Christi Olson describes the changing face of search engine results pages and how new technology and functionality is benefiting both searchers and advertisers. The post Search takes to the cloud as users engage with new actions appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/search-takes-cloud-users-engage-new-actions-258385
The extension is rolling out globally. The post Bing Ads launches Structured Snippets for text ads appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/bing-ads-launches-structured-snippets-text-ads-258740
Columnist Laura Collins lays out five mistakes she commonly sees in paid search and explains how to avoid them. The post Common PPC mistakes and how to avoid them appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/common-ppc-mistakes-avoid-258198
How do you create great content that captures organic search and social traffic alike? Columnist Matthew Barby explains his method for identifying content themes and topics. The post Using social metrics + SEO + questions to create content that drives inbound traffic appeared first on Search...
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/using-social-metrics-seo-questions-create-content-drives-inbound-traffic-255373

Posted by luciamarin

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author's views are entirely his or her own and may not reflect the views of Moz, Inc.

In this article, we're going to learn how to create the rel canonical URL tag using Google Tag Manager, and how to insert it in every page of our website so that the correct canonical is automatically generated for each URL. We'll do it using Google Tag Manager and its variables.

Why send a canonical from each page to itself?
Javier Lorente gave us a very good explanation/reminder at the 2015 SEO Salad event in Zaragoza (Spain). In short, there may be various factors that cause Google to index unexpected variants of a URL, and this is often beyond our control:
By including this "standard" canonical in every URL, we are making it easy for Google to identify the original content.

How do we generate the dynamic value of the canonical URL?
To generate the canonical URL dynamically, we need to force it to always correspond to the "clean" (i.e., absolute, unique, and simplified) URL of each page (taking into account the www, URL query string parameters, anchors, etc.). Remember that, in summary, the URL variables that can be created in GTM (Google Tag Manager) correspond to the following components: We want to create a unique URL for each page, without queries or anchors. We need a "clean" URL variable, and we can't use the {{Page URL}} built-in variable, for two reasons:
Therefore, we need to combine Protocol + Host + Path into a single variable. Now, let's take a step-by-step look at how to create our {{Page URL Canonical}} variable.

1. Create {{Page Protocol}} to compile the section of the URL according to whether it's http:// or https://

Note: We're assuming that the entire website will always function under a single protocol. If that's not the case, then we should substitute plain text for the {{Page Protocol}} variable in the final variable of Step #4. (This will allow us to force it to always be http/https, without exception.)

2. Create {{Page Hostname Canonical}}

We need a variable in which the hostname is always unique, whether or not it's entered into the browser with the www. The hostname canonical must always be the same, regardless of whether or not it has the www. We can decide based on which one of the domains is redirected to the other, and then keep the original as the canonical. How do we create the canonical domain?
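One way to sketch the canonical-hostname logic, if you prefer a GTM Custom JavaScript Variable over lookup tables, is below. The function name and the non-www assumption are mine, not the article's; invert the logic if your site redirects non-www to www.

```javascript
// Sketch of a GTM Custom JavaScript Variable for {{Page Hostname Canonical}}.
// Assumption: the non-www hostname is the canonical one.
function pageHostnameCanonical() {
  // In GTM this would be an anonymous function reading location.hostname;
  // the fallback value only keeps the sketch runnable outside a browser.
  var host = (typeof location !== 'undefined') ? location.hostname : 'www.example.com';
  // Strip a leading "www." so both variants resolve to one canonical host.
  return host.replace(/^www\./, '');
}
```

Either approach works; the important property is that the variable returns exactly one hostname no matter which variant the visitor typed.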
3. Enable the {{Page Path}} built-in variable
Note: Although we have the {{Page Hostname}} built-in variable, for this exercise it's preferable not to use it, as we're not 100% sure how it will behave in relation to the www (e.g., it's not configurable in this instance, unlike when we create it as a GTM custom variable).

4. Create {{Page URL Canonical}}

Link the three previous variables to form a constant variable: {{Page Protocol}}://{{Page Hostname Canonical}}{{Page Path}}

Summary/Important notes:
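To illustrate why this concatenation yields a "clean" URL, here is a small sketch (the example URL, canonical host, and function name are hypothetical) using the standard URL parser: only the protocol, canonical hostname, and path survive, so query strings and #fragments drop out by construction.

```javascript
// Mirrors the constant variable {{Page Protocol}}://{{Page Hostname Canonical}}{{Page Path}}.
function pageUrlCanonical(href, canonicalHost) {
  var u = new URL(href); // u.protocol includes the trailing colon, e.g. "https:"
  // Query string (u.search) and fragment (u.hash) are deliberately discarded.
  return u.protocol + '//' + canonicalHost + u.pathname;
}

// pageUrlCanonical('https://www.example.com/plants/?sort=asc#top', 'example.com')
// → 'https://example.com/plants/'
```

The www is normalized because the second argument is the already-canonical host, not whatever the browser happened to receive.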
Now that we have created {{Page URL Canonical}}, we could even populate it into Google Analytics via custom dimensions. You can learn to do that in this Google Analytics custom dimensions guide.

How can we insert the canonical into a page using Tag Manager?

Let's suppose we've already got a canonical URL generated dynamically via GTM: {{Page URL Canonical}}. Now, we need to look at how to insert it into the page using a GTM tag. We should emphasize that this is NOT the "ideal" solution, as it's always preferable to insert the tag into the <head> of the source code. But we have confirming evidence from various sources that it DOES work when inserted via GTM. And, as we all know, in most companies, the ideal doesn't always coincide with the possible! If we could insert content directly into the <head> via GTM, it would be sufficient to use the following custom HTML tag:

<link rel="canonical" href="{{Page URL Canonical}}" />

But we know that this won't work, because content inserted via custom HTML tags usually goes at the end of the </body>, and Google won't accept or read a <link rel="canonical"> tag there. So then, how do we do it? We can use JavaScript code to generate the tag and insert it into the <head>, as described in this article, but in a form adapted for the canonical tag:

<script>
var c = document.createElement('link');
c.rel = 'canonical';
c.href = {{Page URL Canonical}};
document.head.appendChild(c);
</script>

And then, we can set it to fire on the "All Pages" trigger. Seems almost too easy, doesn't it?

How do we check whether our rel canonical is working?

Very simple: check whether the code is generated correctly on the page. How do we do that? By looking at the DevTools Console in Chrome, or by using a browser plugin like Firebug that returns the code generated on the page in the DOM (document object model). We won't find it in the source code (Ctrl+U). Here's how to do this step-by-step:
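As a sketch of that check (the helper name is mine, not from the article), a tiny function you can paste into the DevTools console as `getCanonicalHref(document)` confirms the tag exists in the live DOM; view-source won't show it, since GTM injects it client-side:

```javascript
// Returns the canonical URL found in the rendered DOM, or null if absent.
// Pass the page's `document` when running in a browser console.
function getCanonicalHref(doc) {
  var tag = doc.querySelector('link[rel="canonical"]');
  return tag ? tag.href : null;
}
```

If this returns your expected clean URL on a few pages with query strings and fragments, the GTM tag is working.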
That's it. Easy-peasy, right? So, what are your thoughts? Do you also use Google Tag Manager to improve your SEO? Why don't you give us some examples of when it's been useful (or not)? via The Moz Blog http://tracking.feedpress.it/link/9375/4406917
Issues a set of dos and don'ts for testing expanded text ads. The post Google bumps Expanded Text Ad deadline to January 31, 2017 appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/google-bumps-expanded-text-ad-deadline-january-31-2017-258710
Below is what happened in search today, as reported on Search Engine Land and from other places across the web. The post SearchCap: Google AdWords login, keyword planner bug & Bing Ads bulk editing appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/searchcap-google-adwords-login-keyword-planner-bug-bing-ads-bulk-editing-258704
Hints of more improvements to come. The post Bing Ads apps keep getting more useful, now with bulk editing appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/bing-ads-app-bulk-enable-pause-258692
In two weeks, SEOs and SEMs will gather at the largest search marketing conference on the East Coast: SMX East. Don't miss your chance to get the latest SEO and SEM tactics and connect with other search marketers. Here's what you'll get: Tactics you can use immediately: SMX East offers 50+ sessions...
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/two-weeks-smx-east-register-now-258669
Columnist Julie Joyce goes through the biggest red flags for a client and link provider relationship so you each know when it's time to say goodbye. The post When to end your client/link provider relationship appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/end-clientlink-provider-relationship-258189
Columnist Eric Enge explains how investing in long-term marketing initiatives can give you a huge advantage over your competition. The post Sustainable competitive advantages in digital marketing appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/sustainable-competitive-advantages-digital-marketing-258100
Also, no need to confirm user access invitations anymore. The post New: Access up to 5 AdWords accounts with one login appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/access-5-adwords-accounts-one-login-258655
Many users with active campaigns are unable to access Keyword Planner. The post Seeing Google Keyword Planner down? You’re not alone appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/google-keyword-planner-down-258643
Born in the mountains of Peru, Sumac won worldwide recognition for her five-octave vocal range. The post Yma Sumac Google doodle celebrates the “Peruvian Songbird” soprano appeared first on Search Engine Land.
Please visit Search Engine Land for the full article. via Search Engine Land: News & Info About SEO, PPC, SEM, Search Engines & Search Marketing http://searchengineland.com/yma-sumac-google-doodle-celebrates-peruvian-songbird-soprano-258628

Posted by Todd_McDonald

A steady rise in content-related marketing disciplines and an increasing connection between effective SEO and content have made the benefits of harnessing strategic content clearer than ever. However, success isn't always easy. It's often quite difficult, as I'm sure many of you know. A number of challenges must be overcome for success to be realized from end to end, and finding quick ways to keep your content ideas fresh and relevant is invaluable. To help with this facet of developing strategic content, I've laid out a process below that shows how a few SEO tools and a little creativity can help you identify content ideas based on actual conversations your audience is having online.

What you'll need

Screaming Frog: The first thing you'll need is a copy of Screaming Frog (SF) and a license. Fortunately, it isn't expensive (around $150 USD for a year), and there are a number of tutorials if you aren't familiar with the program. After you've downloaded and set it up, you're ready to get to work.

Google AdWords account: Most of you will have access to an AdWords account from actually running ads through it. If you aren't active with the AdWords system, you can still create an account and use the tools for free, although the process has gotten more annoying over the years.

Excel/Google Drive (Sheets): Either one will do. You'll need something to work with the data outside of SF.

Browser: We walk through the examples below using Chrome.

The concept

One way to gather ideas for content is to aggregate data on what your target audience is talking about.
There are a number of ways to do this, including utilizing search data, but search data lags behind real-time social discussions, and the various tools we have at our disposal as SEOs rarely show the full picture without a LOT of monkey business. In some situations, determining intent can be tricky and require further digging and research. On the flipside, gathering information on social conversations isn't necessarily that quick either (Twitter threads, Facebook discussions, etc.), and many tools that have been built to enhance this process are cost-prohibitive. But what if you could efficiently uncover hundreds of specific topics, long-tail queries, questions, and more that your audience is talking about, and do it in around 20 minutes of focused work? That would be sweet, right? Well, it can be done by using SF to crawl discussions that your audience is having online in forums, on blogs, on Q&A sites, and more. Still here? Good, let's do this.

The process

Step 1 – Identifying targets

The first thing you'll need to do is identify locations where your ideal audience is discussing topics related to your industry. While you may already have a good sense of where these places are, expanding your list or identifying sites that match well with specific segments of your audience can be very valuable. To complete this task, I'll utilize Google's Display Planner. For the purposes of this article, I'll walk through this process for a pretend content-driven site in the Home and Garden vertical. Please note, searches within Google or other search engines can also be a helpful part of this process, especially if you're familiar with advanced operators and can identify platforms with obvious signatures that sites in your vertical often use for community areas. WordPress and vBulletin are examples of that.
Google's Display Planner

Before getting started, I want to note I won't be going deep on how to use the Display Planner, for the sake of time and because there are a number of resources covering the topic. I highly suggest some background reading if you're not familiar with it, or at least some brief hands-on experimenting. I'll start by looking for options in Google's Display Planner by entering keywords related to my website and the topics of interest to my audience. I'll use the single word "gardening." In the screenshot below, I've selected "individual targeting ideas" from the menu mid-page, and then "sites." This allows me to see specific sites the system believes match well with my targeting parameters. I'll then select a top result to see a variety of information tied to the site, including demographics and main topics. Notice that I could refine my search results further by utilizing the filters on the left side of the screen under "Campaign Targeting." For now, I'm happy with my results and won't bother adjusting these.

Step 2 – Setting up Screaming Frog

Next, I'll take the website URL and open it in Chrome. Once on the site, I need to first confirm that there's a portion of the site where discussion is taking place. Typically, you'll be looking for forums, message boards, comment sections on articles or blog posts, etc. Essentially, any place where users are interacting can work, depending on your goals. In this case, I'm in luck. My first target has a "Gardening Questions" section that's essentially a message board. A quick look at a few of the thread names shows a variety of questions being asked and a good number of threads to work with. The specific parameters around this are up to you — just a simple judgment call. Now for the fun part — time to fire up Screaming Frog!
I'll utilize the "Custom Extraction" feature, found within SF at Configuration → Custom → Extraction (you can find more details and a broader set of use-case documentation for this feature here). Utilizing Custom Extraction will allow me to grab specific text (or other elements) off a set of pages.

Configuring extraction parameters

I'll start by configuring the extraction parameters. In this shot, I've opened the custom extraction settings and set the first extractor to XPath. I need multiple extractors set up, because multiple thread titles on the same URL need to be grabbed. You can simply cut and paste the code into the next extractors — but be sure to update the number sequence (outlined in orange) at the end to avoid grabbing the same information over and over. Notice as well that I've set the extraction type to "extract text." This is typically the cleanest way to grab the information needed, although experimentation with the other options may be required if you're having trouble getting the data you need. Tip: As you work on this, you might find you need to grab different parts of the HTML than what you thought. Getting things dialed in can take some trial and error (more on this below).

Grabbing XPath code

To grab the actual extraction code we need (visible in the middle box above):
Make sure you see the text you want highlighted in the code view, then right-click and select “XPath” (you can use other options, but I recommend reviewing the SF documentation mentioned above first). It’s worth noting that many times, when you're trying to grab the XPath for the text you want, you’ll actually need to select the HTML element one level above the text selected in the front-end view of the website (step three above). At this point, it’s not a bad idea to run a very brief test crawl to make sure the desired information is being pulled. To do this:
Resolving extraction issues & controlling the crawl
Everything looks good in my example, on the surface. What you’ll likely notice, however, is that there are other URLs listed without extraction text. This can happen when the code is slightly different on certain pages, or when SF moves on to other site sections. I have a few options to resolve this issue:
In this situation, I’m going to exclude the pages I can’t pull information from with my current settings and lock SF onto the content I want. This may be another point of experimentation, but it doesn’t take much experience to get a feel for the direction you’ll want to go if the problem arises. To lock SF to the URLs I want data from, I’ll use the “include” and “exclude” options under the “Configuration” menu. I’ll start with the include options. Here, I can configure SF to crawl only specific URLs on the site using regex. In this case, what’s needed is fairly simple: I just want to include anything in the /questions/ subfolder, which is where I originally found the content I want to scrape. One parameter is all that’s required, and it happens to match the example given within SF ☺:
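SF’s include filter is just a regex matched against each discovered URL. As a sketch of how the single /questions/ pattern behaves (the domain and URLs are made up for illustration):

```python
import re

# Include pattern in the style Screaming Frog expects:
# crawl only URLs inside the /questions/ subfolder.
include = re.compile(r".*/questions/.*")

urls = [
    "https://example-gardening-site.com/questions/identify-this-plant",
    "https://example-gardening-site.com/blog/spring-checklist",
    "https://example-gardening-site.com/questions/when-to-prune-roses",
]

# Only URLs matching the include pattern would be crawled.
crawlable = [u for u in urls if include.fullmatch(u)]
print(crawlable)
```

Anything outside /questions/ (like the blog URL above) is never requested, which keeps the crawl tightly scoped.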
The “excludes” are where things get slightly (but only slightly) trickier. During the initial crawl, I took note of a number of URLs that SF was not extracting information from. In this instance, these pages are neatly tucked into various subfolders, which makes exclusion easy as long as I can find and appropriately define them. To cut these folders out, I’ll add the following lines to the exclude filter:
Upon further testing, I discovered I needed to exclude the following folders as well:
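The exclude filter works the same way, one regex per line, except that matching URLs are dropped rather than kept. The actual folder list from the example crawl was shown in a screenshot, so the subfolders below (/users/, /tags/) are hypothetical stand-ins:

```python
import re

# Hypothetical exclude patterns, one per subfolder, in SF's regex style.
excludes = [re.compile(p) for p in (r".*/users/.*", r".*/tags/.*")]

urls = [
    "https://example-gardening-site.com/questions/identify-this-plant",
    "https://example-gardening-site.com/users/greenthumb42",
    "https://example-gardening-site.com/tags/roses",
]

# A URL survives only if no exclude pattern matches it.
kept = [u for u in urls if not any(x.fullmatch(u) for x in excludes)]
print(kept)
```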
It’s worth noting that you don’t HAVE to work through this part of configuring SF to get the data you want. If SF is let loose, it will crawl everything within the start folder, which would also include the data I want. The refinements above are far more efficient from a crawl perspective, and they also lessen the chance I’ll be a pest to the site. It’s good to play nice.

Completed crawl & extraction example

Here’s how things look now that I’ve got the crawl dialed in: Now I’m 99.9% good to go! The last crawl configuration step is to reduce speed to avoid negatively impacting the website (or getting throttled). This can easily be done by going to Configuration → Speed and reducing the number of threads and URIs that can be crawled. I usually stick with something at or under 5 threads and 2 URIs.

Step 3 – Ideas for analyzing data

After the end goal is reached (run time, URIs crawled, etc.), it’s time to stop the crawl and move on to data analysis. There are a number of ways to start breaking apart the information grabbed, but for now I’ll walk through one approach with a couple of variations.

Identifying popular words and phrases

My objective is to help generate content ideas and identify words and phrases that my target audience is using in a social setting. To do that, I’ll use a couple of simple tools to break apart my information: The top two URLs perform text analysis; some of you may already be familiar with the basic word-cloud-generating abilities of tagcrowd.com. Online-Utility won’t pump out pretty visuals, but it provides a helpful breakout of common 2- to 8-word phrases, as well as occurrence counts on individual words. There are many tools that perform these functions; find the ones you like best if these don’t work! I’ll start with Tagcrowd.com.

Utilizing Tagcrowd for analysis

The first thing I need to do is export a .csv of the data scraped from SF and combine all the extractor data columns into one.
I can then remove blank rows, and after that scrub my data a little. Typically, I remove things like:
Now that I've got a clean data set free of extra characters and odd spaces, I'll copy the column and paste it into a plain text editor to remove formatting. I often use the one online at editpad.org. That leaves me with this: In Editpad, you can easily copy your clean data and paste it into the entry box on Tagcrowd. Once you’ve done that, hit visualize and you’re there. There are a few settings down below in Tagcrowd that can be edited, such as minimum word occurrence, similar word grouping, etc. I typically use a minimum word occurrence of 2, which I’ve used for this example, so that there’s some level of frequency and the clutter is cut out. You may set a higher threshold depending on how many words you want to look at. For my example, I've highlighted a few items in the cloud that are somewhat informational. Clearly, there’s a fair amount of discussion around “flowers,” “seeds,” and the words “identify” and “ID.” While I have no doubt my gardening sample site is already discussing most of these major topics, such as flowers, seeds, and trees, perhaps they haven’t realized how common questions around identification are. This one item could lead to a world of new content ideas. In my example, I didn’t crawl my sample site very deeply, so my data was fairly limited. Deeper crawling will yield more interesting results, and you’ve likely realized already that in this example, crawling during various seasons could highlight topics and issues that are currently important to gardeners. It’s also interesting that the word “please” shows up. Many would probably ignore this, but to me, it’s likely a subtle signal about the communication style of the target market I'm dealing with. This is polite and friendly language that I'm willing to bet would not show up on message boards and forums in many other verticals ☺.
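The combine-scrub-count workflow above can also be done in a few lines of Python instead of Editpad plus Tagcrowd. This is a sketch, not the article's method: the CSV layout and column names ("Extractor 1", "Extractor 2") are hypothetical stand-ins for a real SF export.

```python
import csv
import io
import re
from collections import Counter

# Stand-in for the Screaming Frog .csv export: one extractor column
# per thread title scraped from the same URL (column names are made up).
export = io.StringIO(
    "Address,Extractor 1,Extractor 2\n"
    "https://example.com/questions/p1,Identify this plant please,What is this weed\n"
    "https://example.com/questions/p2,Please help identify this flower,\n"
)

# Combine all extractor columns into one list, dropping blank cells.
titles = []
for row in csv.DictReader(export):
    titles += [v for k, v in row.items() if k.startswith("Extractor") and v]

# Scrub to lowercase words and count occurrences, Tagcrowd-style.
words = Counter(re.findall(r"[a-z']+", " ".join(titles).lower()))

# Apply a minimum word occurrence of 2 to cut the clutter.
frequent = {w: n for w, n in words.items() if n >= 2}
print(frequent)
```

On this toy data, "identify", "this", and "please" clear the threshold, mirroring the kind of signal the word cloud surfaced.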
Often, the greatest insights from this type of study, beyond the popular topics themselves, come from a better understanding of the communication style and phrasing your audience uses. All of this information can help you craft your strategy for connection, content, and outreach.

Utilizing Online-Utility.org for analysis

Since I've already scrubbed and prepared my data for Tagcrowd, I can paste it into the Online-Utility entry box and hit “process text.” After doing this, I ended up with this output: There’s more information available, but for the sake of space, I've grabbed only a couple of shots to give you an idea of most of what you’ll see. Notice in the first image that the phrases “identify this plant” and “what is this” both show up multiple times in the content I grabbed, further supporting the likelihood that content developed around plant identification is a good idea and something that seems to be in demand.

Utilizing Excel for analysis

Let’s take a quick look at one other method for analyzing my data. One of the simplest ways to digest the information is in Excel. After scrubbing the data and combining it into one column, a simple A→Z sort puts the information in a format that helps bring patterns to light. Here, I can see a list of specific questions ripe for content development! This type of information, combined with data from tools such as keywordtool.io, can help identify and capture long-tail search traffic and topics of interest that would otherwise be hidden.

Tip: Extracting information this way sets you up for very simple promotion opportunities. If you build great content that answers one of these questions, go share it back at the site you crawled! There’s nothing spammy about providing a good answer with a link to more information if the content you’ve developed is truly an asset.
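The multi-word phrase breakout that Online-Utility produces is just n-gram counting, and you can approximate it yourself if you'd rather stay out of the browser. The thread titles below are invented for illustration:

```python
from collections import Counter

# Hypothetical cleaned thread titles from the crawl.
titles = [
    "can you identify this plant",
    "please identify this plant for me",
    "what is this bug on my roses",
    "what is this yellow flower",
]

def ngrams(words, n):
    """All consecutive n-word phrases in a list of words."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Count 2- to 4-word phrases across all titles (Online-Utility goes to 8).
phrases = Counter()
for t in titles:
    w = t.split()
    for n in range(2, 5):
        phrases.update(ngrams(w, n))

# Phrases occurring more than once point at recurring questions.
repeated = {p: c for p, c in phrases.items() if c > 1}
print(repeated)
```

On this sample, "identify this plant" and "what is this" each surface twice, the same kind of repeated-question signal described above.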
It’s also worth noting that since this site was discovered through the Display Planner, I already have demographic information on the folks who are likely posting these questions. I could also do more research on who is interested in this brand (and likely posting this type of content) using the powerful ad tools at Facebook. This information allows me to quickly connect demographics with content ideas and keywords. While intent has proven to be very powerful and will sometimes outweigh misaligned messaging, it’s always good to know as much as possible about who you're talking to and to cater your messaging to them.

Wrapping it up

This is just the beginning, and it’s important to understand that. The real power of this process lies in its use of simple, affordable tools to gain information efficiently, making it accessible to many on your team and an easy sell to those who hold the purse strings, no matter your organization's size. The process is affordable for mid-size and small businesses, and for those at the enterprise level it's far less likely to leave you waiting on larger purchases. What information is gathered and how it is analyzed can vary wildly, even within my stated objective of generating content ideas. All of it can be right. The variations on this method are numerous, and they allow creative problem solvers and thinkers to easily gather data that can bring them great insight into their audiences’ wants, needs, psychographics, demographics, and more. Be creative and happy crawling!

via The Moz Blog http://tracking.feedpress.it/link/9375/4399084