
How to Find Broken Links Using Selenium Webdriver

Hi everyone, welcome back to the Selenium WebDriver tutorial. Today we are going to cover a basic check that every web application needs: we will find broken links using Selenium and check the HTTP status of each one.

 

What does it mean to find broken links using Selenium?

As the name suggests, finding broken links using Selenium means checking for links that point to a wrong or invalid URL.

 


 

 

I am sure you have come across a 404 Page Not Found error in most applications. A link that leads to such a page is called a broken link.

It is not only links; you may also have to verify images, which we will see in the next tutorial.

While doing validation, you only have to verify the HTTP status code of the response:

1- 200- Success- OK
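The status check can be made a little more explicit with a small helper. The following is only a sketch: the class name `LinkStatus` and both method names are hypothetical, and treating every 4xx or 5xx response as broken is an assumption you may want to adjust for your application.

```java
import java.net.HttpURLConnection;

public class LinkStatus {

    // Assumption: any 4xx (client error) or 5xx (server error) response
    // means the link is broken. 2xx and 3xx are treated as not broken.
    public static boolean isBroken(int statusCode) {
        return statusCode >= HttpURLConnection.HTTP_BAD_REQUEST; // 400
    }

    // Maps a status code to a short label for console output.
    public static String describe(int statusCode) {
        if (statusCode == HttpURLConnection.HTTP_OK) {
            return "OK";
        }
        if (statusCode == HttpURLConnection.HTTP_NOT_FOUND) {
            return "Not Found";
        }
        if (statusCode >= 500) {
            return "Server error";
        }
        if (statusCode >= 400) {
            return "Client error";
        }
        if (statusCode >= 300) {
            return "Redirect";
        }
        return "Other";
    }
}
```

For example, `LinkStatus.isBroken(404)` returns true, while `LinkStatus.isBroken(200)` returns false.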

 

Scenario for finding broken links using Selenium

Before jumping to the code, let’s take one simple example to understand the concept.

Example 1- Suppose we have an application that contains 400 links, and we need to verify whether each link is broken or not.

 

Approach 1- 

Manual process- Go to each link and verify whether it is working or not.

Do you think this is a valid approach? No. It will take a full day to verify, and you will not keep the same efficiency and interest throughout.

 

Approach 2-

Smart work- Write code that visits every link and verifies its status.

Since we are all smart, we will take the smart approach and see how to find broken links using Selenium.

Before I start, let me introduce the HttpURLConnection class, which will help us verify the status of the response.
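To see HttpURLConnection on its own, without Selenium, here is a minimal sketch that returns the status code for one URL. The class name `StatusFetcher` and the method `fetchStatus` are hypothetical. Two choices here are assumptions rather than part of this tutorial: it sends a HEAD request so that the response body is not downloaded (some servers do not support HEAD), and it builds the URL through `URI.create(...).toURL()`.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class StatusFetcher {

    // Opens a connection to the given URL and returns the HTTP status code.
    public static int fetchStatus(String linkUrl) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(linkUrl).toURL().openConnection();

        // Assumption: HEAD is enough because only the status code is needed.
        connection.setRequestMethod("HEAD");

        // Give up after 3 seconds instead of waiting indefinitely.
        connection.setConnectTimeout(3000);
        connection.setReadTimeout(3000);

        try {
            // getResponseCode() sends the request and reads the status line.
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
```

A call such as `StatusFetcher.fetchStatus("http://www.google.co.in/")` returns the status code as an int, which you can then compare with `HttpURLConnection.HTTP_OK`.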

Precondition- 

1- Selenium setup should be completed

 


 

Program to find broken links using Selenium

 

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

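public class VerifyLinks {

	public static void main(String[] args) 
	{
		WebDriver driver=new FirefoxDriver();
		
		try
		{
			driver.manage().window().maximize();
			
			driver.get("http://www.google.co.in/");
			
			// Collect every anchor tag on the current page
			List<WebElement> links=driver.findElements(By.tagName("a"));
			
			System.out.println("Total links are "+links.size());
			
			for(int i=0;i<links.size();i++)
			{
				WebElement ele= links.get(i);
				
				String url=ele.getAttribute("href");
				
				// Anchors without an href return null, so skip them
				if(url==null || url.isEmpty())
				{
					continue;
				}
				
				verifyLinkActive(url);
			}
		}
		finally
		{
			// Close the browser even if something goes wrong
			driver.quit();
		}
	}
	
	public static void verifyLinkActive(String linkUrl)
	{
		try 
		{
			URL url = new URL(linkUrl);
			
			HttpURLConnection httpURLConnect=(HttpURLConnection)url.openConnection();
			
			// Do not wait more than 3 seconds to connect or to read
			httpURLConnect.setConnectTimeout(3000);
			httpURLConnect.setReadTimeout(3000);
			
			httpURLConnect.connect();
			
			// Read the status once and reuse it
			int responseCode=httpURLConnect.getResponseCode();
			
			if(responseCode==HttpURLConnection.HTTP_OK)
			{
				System.out.println(linkUrl+" - "+httpURLConnect.getResponseMessage());
			}
			else
			{
				// 404 and every other non-200 status is printed with its code
				System.out.println(linkUrl+" - "+httpURLConnect.getResponseMessage() + " - "+ responseCode);
			}
			
			httpURLConnect.disconnect();
		} catch (Exception e) {
			// Report the failure (bad URL, timeout, unknown host) instead of hiding it
			System.out.println(linkUrl+" - "+e.getClass().getSimpleName()+": "+e.getMessage());
		}
	}

}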
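Not every href on a page is worth sending to HttpURLConnection. Some anchors have no href at all, some use schemes such as mailto: or javascript:, and the same URL often appears more than once. The sketch below filters a list of href values before checking them. The class name `HrefFilter` and the method `checkableUrls` are hypothetical, and it assumes the href values already start with a lowercase http:// or https:// scheme.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class HrefFilter {

    // Returns the distinct http/https URLs from the given href values,
    // in the order in which they first appear.
    public static List<String> checkableUrls(List<String> hrefs) {
        // LinkedHashSet drops duplicates but keeps insertion order.
        Set<String> unique = new LinkedHashSet<>();

        for (String href : hrefs) {
            // An anchor without an href attribute gives null.
            if (href == null) {
                continue;
            }

            String trimmed = href.trim();

            // Assumption: only http and https links can be verified
            // with HttpURLConnection; mailto:, javascript: etc. are skipped.
            if (trimmed.startsWith("http://") || trimmed.startsWith("https://")) {
                unique.add(trimmed);
            }
        }

        return new ArrayList<>(unique);
    }
}
```

In the program above, you could collect each `ele.getAttribute("href")` value into a List, pass that list to `checkableUrls`, and call `verifyLinkActive` only for the URLs it returns.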

 

Output

You can see we are getting 49 links and all seem OK.

Congrats 🙂


 

 

I hope you can see how easy and how important it is to verify links and images. Try the above program and let me know if you face any issue with it.

 


About Mukesh Otwani

I am Mukesh Otwani, a working professional in the beautiful city of Bangalore, India. I completed my BE from RGPV University, Bhopal. I have had a passion for automation testing for a couple of years. I started with Selenium and then got the chance to work with other tools like Maven, Ant, Git, GitHub, Jenkins, Sikuli, Selenium Builder etc.

77 thoughts on “How to Find Broken Links Using Selenium Webdriver”

  1. Manjusha says:

    how to perform automation when links are in excel file and I want to test broken links

    1. Hi Manjusha,

      Iterate over the Excel file to read all links and add them to a data structure (such as a List). Then check each URL using an HTTP library.

  2. ahmed nur says:

    Do you have this in c#

    1. Hi Ahmed,

      Not as of now but I’ll post it soon…:)

  3. Collin Code says:

    Any way you have a demonstration of this done in python?

    1. Hi Colin,

      Not yet, but I’m planning to post it soon…:)

  4. Mathi says:

    brother i need to create a app for find a number of broken link in a page using ember . How i do that??

    1. Hi Mathi,

      Apologies, I never worked on ember…:(

  5. Neha says:

    Hi,

    I am new into coding.. Why we used try and catch to validate the response code?

  6. Karthik says:

    I am getting unknownhostexception when running this program

    1. Hi Karthik,

      Are you running this code behind a proxy or firewall? Also check whether the URL is http or https.

  7. Prakash chandra Khulbey says:

    Hello Mukesh
    Nice blog…..

    I have a query on above, can i use the same code while automating Mobile Web page or i need to change somthing in it.

    I have used same it is not executing while getting “href” in below line:
    String url=ele.getAttribute(“href”);

    Kindly help me.

    1. Hi Prakash,

      On mobile web browser also, you can use same code.

  8. Lilia says:

    Hello,
    I am having total links are 54
    and only 4 links are printed in the console. Why is that?

    INFO: Detected dialect: OSS
    Total links are 54
    https://www.kaffekapslen.dk/til-nespresso.html – OK
    https://www.kaffekapslen.dk/til-dolce-gusto.html – OK
    https://www.kaffekapslen.dk/til-tassimo.html – OK
    https://www.kaffekapslen.dk/kundeservice-og-kontaktoplysninger – OK
    [main] INFO net.serenitybdd.core.Serenity –

    1. Hi Lilia,

      Can you please mention what xpath have you used ?

  9. Rajnish says:

    Hi Mukesh,
    How to verify the other page links give errors 404 in the website. because this method only verify the few links but others are not found.

    1. Hi Rajnish,

      Using Selenium and href attribute, you can only fetch current webpage links. If you want to check links available on other page then you have to navigate to intended page.

  10. Girija says:

    Hello Mukesh,
    This method is working when broken image gives 404 error page. But what should we do when a website have managed 404 page to standardised error page ?

    1. Hi Girija,

      In your case, you may need to use GET/POST request using same HttpURLConnection class followed by verification of response of request.

  11. Rohit says:

    Hi Mukesh,

    Can you please make detail video of Web Services testing(Soupui and other related concepts) ? this will be very helpful.
    Thanks in advance.

    1. Hi Rohit,

      I’ll post corresponding videos soon…:)

  12. Tauseef says:

    Hi Mukesh,

    How can we generate report have these columns s.no , Page name, Link , status ?

    Regards,
    Tauseef

    1. Hi Tauseef,

      To achieve this, use a loop so that each iteration produces the serial number. The page name depends on your naming convention, the page link can be acquired using driver.getCurrentUrl(), and the status can be returned at the end of the execution flow.
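      As a rough sketch of such a report, the class below builds CSV text with the four requested columns. The class name `LinkReport` and its methods are hypothetical, and CSV as the output format is an assumption; the serial number comes from the number of rows already added.

```java
import java.util.ArrayList;
import java.util.List;

public class LinkReport {

    private final List<String> rows = new ArrayList<>();

    public LinkReport() {
        // Header row with the four requested columns.
        rows.add("S.No,Page,Link,Status");
    }

    // Adds one checked link; the serial number is assigned automatically.
    public void add(String pageName, String link, int status) {
        // The header sits at index 0, so the current size is the next serial number.
        int serial = rows.size();
        rows.add(serial + "," + quote(pageName) + "," + quote(link) + "," + status);
    }

    // Wraps a value in double quotes and doubles any quotes inside it,
    // so that commas in page names or URLs do not break the columns.
    private static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Returns the whole report as CSV text, one row per line.
    public String toCsv() {
        return String.join("\n", rows);
    }
}
```

      Inside the loop over links, you would call `report.add(pageName, url, responseCode)` for each link and write `report.toCsv()` to a file at the end.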

  13. Pradeep says:

    Hi Mukesh, I am facing issue with appium installation I have downloaded all the tools and jar files and installed but padnet, genymotion, appiumserver are not working in my system and selenium web driver is going fine
    Can you please help about this issues if you have done vedeos for this please share.

    I am following your vedeos and past from 3montgs it’s good and helped soo much…thanks

    1. Hi Pradeep,

      I’ll post few more videos very soon…:)

  14. preethi says:

    hi mukesh , pls make a video for jmeter

    1. Hi Preethi,

      Its already in pipeline. Please stay tuned…:)

  15. richa shastri says:

    Hi , what is good approach to handle stale element reference exceptions

  16. Daniel says:

    Why is the method verifyLinkActive public? I give methods the smallest scope needed, which would in this case be private. Is it necessary that other classes can use this method too?

    1. Hi Daniel,

      For the sake of simplicity, I made it as public otherwise when you use it into framework then you can implement proper access specifier for any method.

  17. Anshul Rajvanshi says:

    Hi Mukesh,
    Few questions.
    1) Opening google via Headless is working fine. But not able to open Secure sites – Any solution to it please?
    2) We are suing Selenium 3.X and ROBOT framework. Got to know htmlunit driver is not a part of the package.
    Do we have to add the PATH of htmlunit driver (downloaded separately) to PATH environments variable?

    1. Hi Anshul,

      Yes, you need to add htmlUnit driver separately to your project. Please check this link for more info

  18. Hash says:

    Hi Mukesh, how can I verify a text is highlighted or not using Selenium Webdriver.

    1. Hi Hash,

      Please use JavaScriptExecutor to verify background/foreground color verification.

  19. Partho Dutta says:

    Hi Mukesh
    In my application first i have to login then i have to find the broken link for entire application. Could he please suggest me how to do this?

    1. Hi Partho,

      I can’t see any difficulties in your mentioned scenario. Login window is usual and very generic activity. And once you login, then you can find all broken links. If this is not what you meant then kindly elaborate your requirement.

  20. Sri Datta says:

    Hi Mukesh,

    Im getting java.net.UnknownHostException this exception while running the code. Is it because of proxy issue? Please help.

    Thanks
    Sri Datta

    1. Yes Please check proxy setting.

  21. Vahe says:

    Hi Mukesh,
    The videos and the blog you have are great. I’m new in automation and I’m learning a lot from you. Very informative and very easy to understand. This code is working great, too. I just have a question maybe you can help me with that.
    Is there a way to test links only in the content of website? Can we exclude the header and footer links? Do you think it is possible?
    Thank you again!

    1. Hi Vahe,

      yes, it is possible to test links available in content of website instead of Header and Footer.

  22. Gursimran singh says:

    mera urlconnection nahi utha raha can u plz help me???

    1. Hi Gursimran,

      Could you please explain this?

  23. Saurabh says:

    Hello Mukesh,

    Good Job Mukesh,
    Mukesh I am getting ‘IndexOutOfBondException’ using this code if my web page having more than 300 links. Can you please suggest me what I can do in this case ?

    1. Hi Saurabh,

      This is a Java exception that occurs when you refer to an index that is not available in an array or list. Alternatively, you can use any List implementation such as ArrayList or LinkedList, whose length is dynamic. Hope this solves your problem.

  24. Vimal says:

    Mukesh,

    I’m very new to Selenium. i just came across your blogs and articles, it is quite amazing and very useful. So im interested to do the POC of my project in Selenium (Currently we are using QTP). i have many doubts by comparing QTP. Can you help me on this. If i get your email id, so that it will be helpful to share my doubts.

    1. Hi Vimal,

      Are you done with your POC?

  25. Vimal says:

    Hi Mukesh,

    when i try to find the broken links in my application, im getting following error.
    java.net.ConnectException: Connection refused: connect. Can you please help me on this.

    1. Hi Vimal,

      This usually happens behind a proxy, so try to set up the proxy and then run the same program.

  26. Vimal says:

    Hi Mukesh,

    I have doubt in the collection of link object. When i try to find the list of link from the page where the attribute is “A”. But in the collection of objects, am getting some null value for few anchor tag. how do we skip that.

    1. Hi Vimal,

      You can add one more condition: if the href is null, skip that link.

  27. Aman says:

    Hi Mukesh,
    very nice explanation
    Q- Can the same code be used to find response code 404?

    1. Aman says:

      Also , there is a separate lib for HttpResponse in Apache, Can it be used. Which is more easier Apache lib or yours method

      1. Both are same I downloaded separately but if you see it also comes with Selenium bundle.

    2. yes we can do that 🙂

  28. bhaskar says:

    kya bath kya bath

    sir,
    How to use Assertion After finding Broken Links and Images

  29. shreyas says:

    Hi Mukesh,

    public static void verifyLinkActive(String linkUrl)
    {
    try
    {
    URL url = new URL(linkUrl);

    What does linkURL does

    1. Hi Shreyas,

      linkUrl is the link address passed to the method. The URL class is used whenever you access a URL over the network.

  30. Naresh says:

    Hi mukesh,thanks for this post,it is working fine for me.and keep posting this type of videos.

    1. Hey Naresh,

      Cheers 🙂

  31. suhana says:

    How to find broken images and links on a site on all pages.

    1. Hi Suhana,

      Kindly use src instead of href and use the img tag. The rest of the code will remain the same.

  32. Shantosh says:

    Hi Mukesh,

    Well explained. Can you please share the tutorial on how to find the broken Images on the web page.

    Much thanks in advance

    1. Hi Shantosh,

      Thanks. The same code will work; only make changes in getAttribute. Try the src attribute.

  33. Jayshreekant says:

    It is checking the VeryFirst Link, After that is not checking the other links !

    1. set proxy and run again.

  34. Harshal says:

    hi mukesh,
    when i am running your code then i got 48 link count.
    Actual link present in consol-44
    If i run using Testng by removing main method and set priority to both method then i got error.
    org.testng.TestNGException:
    Method verifyLinkActive requires 1 parameters but 0 were supplied in the @Test annotation.

    1. Hi Harshal,

      please add parameter in method as well.

  35. Neha says:

    Well explained ..always been a fan of your tutorials.. Gr8 job

  36. c. says:

    I get this exception when I your code at Eclipse
    Exception in thread “main” java.lang.NoClassDefFoundError: com/google/common/base/Function
    at verifyBrokenLinks.main(verifyBrokenLinks.java:15)
    Caused by: java.lang.ClassNotFoundException: com.google.common.base.Function
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    … 1 more

    1. Hi, the Selenium jars are not added completely.

  37. Abhishek Gupta says:

    Hi Mukesh,

    Can you explain why there is discrepancies in total no. of links. It always comes out to be more in no.
    refer Image : http://s15.postimg.org/hxsuih10b/screenshot_domain_date_time.png
    Also I believe in this depth level is limited to visited Page.

    1. Hi Abhishek,

      It might be due to the a tag.

  38. Nitin says:

    Hi Mukesh,

    Thank you for listing this amazing stuff on your blog. I really love the way you explore things that are so genuine and are in for everyday use. Thank you once again!
