Selenium is a popular open-source framework for automating web browsers, often used for web testing. Here’s a structured guide to learning Selenium basics that will help you prepare for interviews:
1. Introduction to Selenium
- What is Selenium? Selenium automates browsers: it can submit forms, navigate between pages, click elements, and more, which makes it a common choice for testing web applications.
- Why use Selenium? It supports multiple programming languages (Java, Python, C#, etc.) and browsers (Chrome, Firefox, Safari).
2. Selenium Components
- Selenium WebDriver: Automates browser actions like clicking buttons or navigating pages.
- Selenium IDE: A record-and-playback tool, primarily used for basic test recording.
- Selenium Grid: Allows parallel test execution on different machines or browsers.
- Selenium RC (Remote Control): The original Selenium 1 API, now deprecated and superseded by WebDriver.
3. Setup
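- Install the WebDriver binary for your browser (ChromeDriver for Chrome, GeckoDriver for Firefox). Recent Selenium 4 releases include Selenium Manager, which can download a matching driver automatically.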
- Install the Selenium library for the language you are using (e.g., Python: pip install selenium; Java: add the Selenium JARs to your project).
4. Basic Commands
- Launching a browser:
from selenium import webdriver
driver = webdriver.Chrome() # or Firefox, Safari, etc.
driver.get('http://google.com') # open the URL
- Locating elements: Selenium 4 locates elements with find_element() and a By strategy (the older find_element_by_* helpers were deprecated in Selenium 4 and removed in later 4.x releases). Common strategies:
By.ID
By.NAME
By.XPATH
By.CSS_SELECTOR
from selenium.webdriver.common.by import By
element = driver.find_element(By.ID, 'element_id')
- Clicking an element:
button = driver.find_element(By.NAME, 'submit')
button.click()
- Typing into an input field:
input_field = driver.find_element(By.NAME, 'username')
input_field.send_keys('testuser')
- Submitting a form:
form = driver.find_element(By.ID, 'login-form')
form.submit()
- Navigating between pages:
driver.back() # Go back
driver.forward() # Go forward
5. Handling Web Elements
- Dropdowns:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
select = Select(driver.find_element(By.NAME, 'dropdown'))
select.select_by_value('value1')
- Checkboxes & Radio Buttons:
checkbox = driver.find_element(By.ID, 'agree')
if not checkbox.is_selected():  # click() toggles, so check the state first
    checkbox.click()
- Handling Alerts:
alert = driver.switch_to.alert
alert.accept() # Accept the alert
6. Handling Waits
Selenium supports two types of waits:
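- Implicit Wait: A global timeout. Every element lookup polls the DOM for up to this long before raising NoSuchElementException.
driver.implicitly_wait(10) # Wait up to 10 seconds on each element lookup
- Explicit Wait: Waits until a specific condition is met for a specific element, and returns as soon as it is.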
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# until() returns the located element once the condition holds
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, 'element_id'))
)
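To see how an explicit wait differs from a fixed sleep, it can help to picture it as a polling loop. The following is a simplified sketch of that idea, not Selenium's actual implementation: wait_until, WaitTimeout, and the parameter names are hypothetical. The real WebDriverWait also ignores NoSuchElementException between polls by default.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for Selenium's TimeoutException."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value as soon as it appears, or raises WaitTimeout
    once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # Stop waiting the moment the condition holds
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

With a real driver, the condition could be something like lambda: driver.find_elements(By.ID, 'element_id'), which returns an empty (falsy) list until the element exists.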
7. Page Navigation and Windows
- Handling multiple windows:
driver.switch_to.window(driver.window_handles[1]) # Switch to new tab/window
- Switching to an iframe:
driver.switch_to.frame('frame_name')
driver.switch_to.default_content() # Switch back to main content
8. Advanced WebDriver Features
Handling Multiple Windows
Selenium can interact with multiple windows or tabs opened by a browser.
# Get current window handle
main_window = driver.current_window_handle
# Open a new window and switch to it
driver.execute_script("window.open('http://example.com', '_blank');")
windows = driver.window_handles
driver.switch_to.window(windows[1]) # Switch to new window
# Switch back to the main window
driver.switch_to.window(main_window)
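Indexing window_handles[1] assumes exactly two windows in a predictable order. A slightly more defensive approach, sketched here, records the handles before the new window opens and then switches to whichever handle is new. The helper name switch_to_new_window is hypothetical; it relies only on the window_handles and switch_to.window() members used above.

```python
def switch_to_new_window(driver, handles_before):
    """Switch to the one window that was not open before and return its handle."""
    new_handles = [h for h in driver.window_handles if h not in handles_before]
    if len(new_handles) != 1:
        raise RuntimeError(
            f"expected exactly one new window, found {len(new_handles)}"
        )
    driver.switch_to.window(new_handles[0])
    return new_handles[0]


# Usage with a real driver (not executed here):
# before = set(driver.window_handles)
# driver.execute_script("window.open('http://example.com', '_blank');")
# switch_to_new_window(driver, before)
```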
Switching Between Frames (iframes)
Switching between iframes is crucial when dealing with pages that have embedded content (like ads or other documents).
# Switch to iframe
driver.switch_to.frame('iframe_name')
# Perform actions inside iframe
element = driver.find_element(By.ID, 'element_inside_iframe')
# Switch back to the main content
driver.switch_to.default_content()
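If an action inside the iframe raises an exception, the call to default_content() can be skipped and later lookups then run against the wrong document. One way to guard against that, sketched here with a hypothetical helper named inside_frame, is a context manager that always switches back.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the block raises, so the driver never stays in the frame
        driver.switch_to.default_content()


# Usage with a real driver (not executed here):
# with inside_frame(driver, 'iframe_name'):
#     driver.find_element(By.ID, 'element_inside_iframe').click()
```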
9. Advanced Locators
Using Dynamic XPath and CSS Selectors
Dynamic XPath can handle complex DOM structures and elements that don’t have static attributes.
# XPath example with text content
driver.find_element(By.XPATH, "//button[text()='Submit']")
# XPath with contains() to handle partial matches
driver.find_element(By.XPATH, "//a[contains(@href, 'part_of_link')]")
# CSS selector with substring attribute matching (*= means "contains")
driver.find_element(By.CSS_SELECTOR, "input[name*='partial_name']")
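When the text in an XPath comes from a variable, quotes inside it can break the expression, because XPath 1.0 string literals have no escape character. A common workaround is to build the literal with concat(). The sketch below shows this; the helper names xpath_literal and button_by_text are hypothetical.

```python
def xpath_literal(text):
    """Return `text` as a valid XPath 1.0 string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types are present: split on apostrophes and rejoin the
    # pieces with a double-quoted apostrophe inside concat(...)
    pieces = [f"'{part}'" for part in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def button_by_text(label):
    """Build an XPath that matches a button by its exact text."""
    return f"//button[text()={xpath_literal(label)}]"
```

The resulting string can be passed to driver.find_element(By.XPATH, ...) like any hand-written XPath.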
10. JavaScript Execution
Selenium WebDriver can execute JavaScript in the context of the web page for scenarios where normal WebDriver actions are insufficient.
# Execute JavaScript
driver.execute_script("document.getElementById('submit').click();")
# Scroll the page using JavaScript
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
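execute_script also accepts extra arguments after the script, which the JavaScript sees as arguments[0], arguments[1], and so on. Passing an already located element this way avoids looking it up a second time in JavaScript. A minimal sketch, with hypothetical helper names scroll_into_view and js_click:

```python
def scroll_into_view(driver, element):
    """Scroll the page so that `element` is visible."""
    # arguments[0] is the first extra argument passed to execute_script
    driver.execute_script("arguments[0].scrollIntoView(true);", element)


def js_click(driver, element):
    """Click `element` through JavaScript instead of a native WebDriver click."""
    driver.execute_script("arguments[0].click();", element)


# Usage with a real driver (not executed here):
# button = driver.find_element(By.ID, 'submit')
# scroll_into_view(driver, button)
# js_click(driver, button)
```

A JavaScript click bypasses the visibility and overlap checks of a native click, so it is usually kept as a fallback rather than the default.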
11. Page Object Model (POM)
The Page Object Model is a design pattern that enhances the structure of test automation by separating test logic from UI interaction.
- Test Class: Contains test methods, calling methods from the page object.
- Page Object: A class representing a web page, encapsulating the web elements and actions for that page.
Example of Page Object:
class LoginPage:
    def __init__(self, driver):
        self.driver = driver

    def enter_username(self, username):
        self.driver.find_element(By.ID, 'username').send_keys(username)

    def enter_password(self, password):
        self.driver.find_element(By.ID, 'password').send_keys(password)

    def click_login(self):
        self.driver.find_element(By.ID, 'login_button').click()
Test Case:
def test_login():
    driver = webdriver.Chrome()
    try:
        driver.get('http://example.com/login')
        login_page = LoginPage(driver)
        login_page.enter_username('testuser')
        login_page.enter_password('password123')
        login_page.click_login()
        # Validate that login succeeded
        assert "Dashboard" in driver.title
    finally:
        driver.quit()  # Always close the browser, even if the assertion fails
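One common refinement of this pattern is to keep each locator in a single class-level constant and expose one higher-level action, so that a changed element ID is fixed in one place. The sketch below rests on several assumptions. The class name LoginPageWithLocators and the login method are hypothetical, and the element IDs are the ones used above. The locators are written as plain ('id', value) tuples, where the string 'id' is the value behind By.ID, so the sketch needs no imports.

```python
class LoginPageWithLocators:
    # Each locator is a (strategy, value) pair, unpacked into find_element()
    USERNAME = ("id", "username")
    PASSWORD = ("id", "password")
    LOGIN_BUTTON = ("id", "login_button")

    def __init__(self, driver):
        self.driver = driver

    def login(self, username, password):
        """Fill in both fields and submit the form in one call."""
        self.driver.find_element(*self.USERNAME).send_keys(username)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.LOGIN_BUTTON).click()
```

A test then reads LoginPageWithLocators(driver).login('testuser', 'password123'), and the page object can be exercised against a stub driver without opening a browser.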
12. Handling File Uploads and Downloads
File Upload:
# Sending a file path to an <input type="file"> element selects that file
driver.find_element(By.ID, 'file-upload').send_keys('C:/path_to_file/file.txt')
File Download:
options = webdriver.ChromeOptions()
prefs = {'download.default_directory': 'C:/Downloads'}
options.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome(options=options)
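Setting the download directory does not tell a test when the download has finished. One way to handle that, sketched here as an assumption rather than a Selenium feature, is to poll the directory until the expected file exists and no Chrome temporary .crdownload file remains. The function wait_for_download and its parameters are hypothetical names.

```python
import time
from pathlib import Path


def wait_for_download(directory, filename, timeout=30.0, poll_interval=0.5):
    """Return the downloaded file's path once the download looks complete."""
    folder = Path(directory)
    target = folder / filename
    deadline = time.monotonic() + timeout
    while True:
        # Chrome writes to a *.crdownload file while a download is in progress
        in_progress = any(folder.glob("*.crdownload"))
        if target.exists() and not in_progress:
            return target
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{filename} not downloaded within {timeout} seconds")
        time.sleep(poll_interval)
```

After triggering the download in the browser, a test could call wait_for_download('C:/Downloads', 'file.txt') before asserting on the file's contents.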
Basic Interview Topics
- Advantages of Selenium: Open-source, supports multiple languages and browsers, cross-browser testing, etc.
- Limitations: No built-in reporting, can’t handle desktop applications.
- Locators: Understand all locators: id, name, xpath, cssSelector.
- Wait Mechanisms: Difference between implicit and explicit waits.
- Page Object Model (POM): Popular design pattern for organizing test scripts in a structured way.
- TestNG (Java): Often used in Selenium for testing, includes features like annotations and parallel testing.
Practice Tasks
- Automate a login process on a demo site.
- Practice locating elements using different locators (ID, XPath, CSS Selector).
- Write a script that waits for a certain element to appear before interacting with it.