Web Application & REST API Integration Plugin

The plugin provides the integration between web application testing functionality and REST API features.

Installation

Copy the line below to the dependencies section of the project build.gradle file.

Please make sure to use the same version for all VIVIDUS dependencies.
Example 1. build.gradle
```
implementation(group: 'org.vividus', name: 'vividus-plugin-web-app-to-rest-api')
```
If the project was imported into the IDE before the new dependency was added, re-generate the configuration files for that IDE and then refresh the project in it.

Table Transformers

`FROM_SITEMAP`

The FROM_SITEMAP transformer generates a table based on the website sitemap.

The use of the web-application.main-page-url property to set the main page for crawling is deprecated and will be removed in VIVIDUS 0.7.0. Use either the mainPageUrl transformer parameter or the transformer.from-sitemap.main-page-url property instead.

The properties outlined in the table below apply to all FROM_SITEMAP transformers used in the tests, unless a transformer overrides the values by specifying the corresponding table parameters.

Property Name	Acceptable values	Default	Description
`transformer.from-sitemap.main-page-url`	URL		The main application page URL, used as initial seed URL that is fetched by the crawler to extract new URLs in it and follow them for crawling.
`transformer.from-sitemap.ignore-errors`	`true` `false`	`false`	Ignore sitemap parsing errors.
`transformer.from-sitemap.strict`	`true` `false`	`true`	Whether invalid URLs will be rejected silently (where "invalid" means that the URL is not under the base URL, see sitemap file location explanation for more details)
`transformer.from-sitemap.filter-redirects`	`true` `false`	`false`	Defines whether URLs that redirect to a URL already included in the table are excluded from the table.

The parameters outlined in the table below are exclusively applicable to the transformer in which they are declared.

Parameter	Acceptable values	Default	Description
`mainPageUrl`	URL	If not specified, the default value is taken from `transformer.from-sitemap.main-page-url` property.	The main application page URL, used as initial seed URL that is fetched by the crawler to extract new URLs in it and follow them for crawling.
`siteMapRelativeUrl`	Relative URL		The relative URL of `sitemap.xml`.
`column`			The column name to store collected relative URLs in the generated table.
`ignoreErrors`	`true` `false`	If not specified, the default value is taken from `transformer.from-sitemap.ignore-errors` property.	Ignore sitemap parsing errors.
`strict`	`true` `false`	If not specified, the default value is taken from `transformer.from-sitemap.strict` property.	Whether invalid URLs will be rejected silently (where "invalid" means that the URL is not under the base URL, see sitemap file location explanation for more details)

Example 2. Usage example

Examples:
{transformer=FROM_SITEMAP, siteMapRelativeUrl=/sitemap.xml, ignoreErrors=true, strict=false, column=page-url}
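
The strict option above treats a URL as invalid when it is not under the base URL of the sitemap. The sketch below illustrates that rule as defined by the sitemap protocol (a sitemap may only list URLs located at or below its own directory). It is not the transformer's actual implementation, and the function names `sitemap_base_url` and `is_under_base` are hypothetical.

```python
from urllib.parse import urlsplit


def sitemap_base_url(sitemap_url: str) -> str:
    """Return the directory of the sitemap file, e.g. https://host/catalog/."""
    parts = urlsplit(sitemap_url)
    # Drop the file name and keep the trailing slash of the directory.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"


def is_under_base(url: str, sitemap_url: str) -> bool:
    """Assumed validity check: the URL must start with the sitemap directory."""
    return url.startswith(sitemap_base_url(sitemap_url))


if __name__ == "__main__":
    sitemap = "https://example.com/catalog/sitemap.xml"
    print(is_under_base("https://example.com/catalog/item-1", sitemap))
    print(is_under_base("https://example.com/images/logo.png", sitemap))
```

With strict=false such out-of-scope URLs would be kept in the generated table instead of being dropped.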

FROM_HEADLESS_CRAWLING

The FROM_HEADLESS_CRAWLING transformer generates a table based on the results of headless crawling.

Property Name	Acceptable values	Default	Description
General
`transformer.from-headless-crawling.main-page-url`	URL		The main application page URL, used as initial seed URL that is fetched by the crawler to extract new URLs in it and follow them for crawling. The value defined by this property is overridden by the value defined in the `mainPageUrl` transformer parameter.
`transformer.from-headless-crawling.seed-relative-urls`	Comma-separated list of values		The list of relative URLs. A seed URL is a URL that is fetched by the crawler to extract new URLs in it and follow them for crawling.
`transformer.from-headless-crawling.exclude-urls-regex`	Regular expression	`.*(css|gif|gz|ico|jpeg|jpg|js|mp3|mp4|pdf|png|svg|zip|woff2|woff|ttf|doc|docx|xml|json|webmanifest|webp)$`	The regular expression to match URLs. The crawler does not crawl URLs matching the given regular expression, and they are not added to the resulting table. URI fragments and URL queries are ignored during filtering.
`transformer.from-headless-crawling.exclude-extensions-regex`	Regular expression	no default value	The property is deprecated in favor of `transformer.from-headless-crawling.exclude-urls-regex` and will be removed in VIVIDUS 0.7.0. The regular expression to match extensions in URLs. The crawler will ignore all URLs referring to files with extensions matching the given regular expression.
`transformer.from-headless-crawling.filter-redirects`	`true` `false`	`false`	Defines whether URLs that redirect to a URL already included in the table are excluded from the table.
`transformer.from-headless-crawling.socket-timeout`	integer	`40000`	Socket timeout in milliseconds.
`transformer.from-headless-crawling.connection-timeout`	integer	`30000`	Connection timeout in milliseconds.
`transformer.from-headless-crawling.max-download-size`	integer	`1048576`	Max allowed size of a page in bytes. Pages larger than this size will not be fetched.
`transformer.from-headless-crawling.max-connections-per-host`	integer	`100`	Maximum connections per host.
`transformer.from-headless-crawling.max-total-connections`	integer	`100`	Maximum total connections.
`transformer.from-headless-crawling.follow-redirects`	`true` `false`	`true`	Whether to follow redirects.
`transformer.from-headless-crawling.max-depth-of-crawling`	integer	`-1`	Maximum depth of crawling. For unlimited depth, set this property to -1.
`transformer.from-headless-crawling.max-pages-to-fetch`	integer	`-1`	Number of pages to fetch. For an unlimited number of pages, set this property to -1.
`transformer.from-headless-crawling.politeness-delay`	integer	`0`	Politeness delay in milliseconds between sending two requests to the same host.
`transformer.from-headless-crawling.max-outgoing-links-to-follow`	integer	`5000`	Max number of outgoing links which are processed from a page.
`transformer.from-headless-crawling.respect-no-follow`	`true` `false`	`false`	Whether to honor links with nofollow flag.
`transformer.from-headless-crawling.respect-no-index`	`true` `false`	`false`	Whether to honor links with noindex flag.
`transformer.from-headless-crawling.user-agent-string`	string	`crawler4j (https://github.com/rzo1/crawler4j/)`	User agent.
`transformer.from-headless-crawling.cookie-policy`	`ignore` `standard` `relaxed`	no default value	Cookie policy as defined per cookie specification.
`transformer.from-headless-crawling.allow-single-level-domain`	`true` `false`	`false`	Whether to consider single level domains valid (e.g. http://localhost).
`transformer.from-headless-crawling.include-https-pages`	`true` `false`	`true`	Whether to crawl https pages.
`transformer.from-headless-crawling.http.headers.<header name>=<header value>`			Set of headers to set for every crawling request being sent. Example: `transformer.from-headless-crawling.http.headers.x-vercel-protection-bypass=1fac2b25014d35e5103b`
Proxy
`transformer.from-headless-crawling.proxy-host`	URL	no default value	Proxy host.
`transformer.from-headless-crawling.proxy-port`	integer	`80`	Proxy port.
`transformer.from-headless-crawling.proxy-username`	string	no default value	Username to authenticate with proxy.
`transformer.from-headless-crawling.proxy-password`	string	no default value	Password to authenticate with proxy.

Parameter Name	Description
`mainPageUrl`	The main application page URL, used as initial seed URL that is fetched by the crawler to extract new URLs in it and follow them for crawling. The value defined by this parameter overrides the value defined by the `transformer.from-headless-crawling.main-page-url` property.
`column`	The column name in the generated table.

Example 3. Usage example

Examples:
{transformer=FROM_HEADLESS_CRAWLING, column=page-url}
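
The filtering described for the transformer.from-headless-crawling.exclude-urls-regex property can be illustrated with a short Python sketch. It assumes the URL is matched as a whole after the query and fragment are removed. The pattern is a shortened version of the default value, and the helper name `is_excluded` is hypothetical. The crawler itself is a Java library, so this only mirrors the documented behavior.

```python
import re
from urllib.parse import urlsplit

# Shortened version of the default pattern, for illustration only.
EXCLUDE_URLS_REGEX = re.compile(r".*(css|gif|jpg|js|pdf|png|svg)$")


def is_excluded(url: str) -> bool:
    """Return True if the URL, without query and fragment, matches the pattern."""
    parts = urlsplit(url)
    # URI fragments and URL queries are ignored during filtering.
    candidate = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return EXCLUDE_URLS_REGEX.fullmatch(candidate) is not None


if __name__ == "__main__":
    for url in (
        "https://example.com/logo.png?v=2#top",
        "https://example.com/about",
        "https://example.com/nodejs",
    ):
        print(url, is_excluded(url))
```

Note that the pattern does not require a dot before the extension, so a page path such as /nodejs is excluded as well because it ends with js.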

FROM_HTML

The transformer has been moved to vividus-plugin-html. The transformer is still available via vividus-plugin-web-app-to-rest-api, but it will be required to install vividus-plugin-html explicitly starting from VIVIDUS 0.7.0.

Steps

Resources validations

Steps to check resource availability using HTTP requests.

Resource validation statuses

Status	Description
`FAILED`	An HTTP request to the resource returns a status code other than 200 OK.
`BLOCKED`	An HTTP request has been blocked by Akamai Web Application Firewall and requires manual validation.
`BROKEN`	Reasons: an HTTP request to the page under test returns an empty HTTP response body; an HTTP request to the page under test results in an unexpected error; the relative page URL cannot be resolved because the `web-application.main-page-url` property is not set; the resource has an invalid URL format; the resource is missing `href` or `src` attributes; the resource has an `href` or `src` attribute but its value is not a valid URL; the resource is a jump link that points to a non-existent jump target.
`PASSED`	An HTTP request to the resource returns 200 OK status code.
`FILTERED`	Reasons: the resource path matches the patterns specified by the `resource-checker.uri-to-ignore-regex` property; the resource path is equal to `#` (anchor); the resource is not an HTTP(S) resource; the resource is a jump link which cannot be verified from the current context (if only part of the document is checked).
`SKIPPED`	The resource has already been validated, e.g. the same resource is present on several pages, so it is not validated twice. `FAILED` and `BROKEN` resource validations keep their respective statuses even if the validation has already been performed.
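
The SKIPPED rule can be read as a small cache of earlier results. The sketch below is one possible interpretation, not the plugin's code: the function name `validate_once` and the cache layout are hypothetical, and only the PASSED, FAILED and BROKEN outcomes are modelled.

```python
from typing import Callable, Dict


def validate_once(url: str, cache: Dict[str, str],
                  validate: Callable[[str], str]) -> str:
    """Validate a resource once and report repeats according to the status table."""
    if url in cache:
        previous = cache[url]
        # FAILED and BROKEN keep their status on repeats; others become SKIPPED.
        return previous if previous in ("FAILED", "BROKEN") else "SKIPPED"
    status = validate(url)
    cache[url] = status
    return status


if __name__ == "__main__":
    seen: Dict[str, str] = {}
    print(validate_once("https://example.com/a", seen, lambda u: "PASSED"))
    print(validate_once("https://example.com/a", seen, lambda u: "PASSED"))
```

The second call returns SKIPPED without invoking the validator again, so no additional HTTP request would be made for a resource shared by several pages.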

Properties

Property Name	Acceptable values	Default	Description
`resource-checker.attributes-to-check`	HTML Attributes	`href,src`	A comma-separated list of HTML attributes in which resource links are searched. If values are present in multiple attributes, the link is taken from the first one.
`resource-checker.publish-response-body`	`true` `false`	`false`	Whether to attach the HTTP response body for HTTP calls with non-successful status codes (may reduce performance).
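
As an illustration of the resource-checker.attributes-to-check rule, the sketch below picks the link from the first configured attribute that has a value. It assumes that "first" refers to the order of the comma-separated list. The function name `pick_link` and the dictionary representation of an element are hypothetical.

```python
from typing import Mapping, Optional, Sequence


def pick_link(element_attributes: Mapping[str, str],
              attributes_to_check: Sequence[str] = ("href", "src")) -> Optional[str]:
    """Return the value of the first configured attribute present on the element."""
    for name in attributes_to_check:
        value = element_attributes.get(name)
        if value:
            return value
    # No configured attribute carries a value.
    return None


if __name__ == "__main__":
    print(pick_link({"src": "/logo.png", "href": "/home"}))
    print(pick_link({"class": "btn"}))
```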

Validate resources on web pages

Validates resources on web pages.

Resource validation logic:

If the pages row contains a relative URL, it is resolved against the URL in the web-application.main-page-url property, e.g. if the main page URL is https://elderscrolls.bethesda.net/ and the relative URL is /skyrim10, the resulting URL is https://elderscrolls.bethesda.net/skyrim10
Collect elements by the specified locator from each page
Get either the href or src attribute value from each element; if neither attribute exists, the validation fails
For each received value, execute a HEAD request
1. If the status code is 200 OK, the resource validation is considered passed
2. If the status code is one of 404 Not Found, 405 Method Not Allowed, 501 Not Implemented, 503 Service Unavailable, a GET request is executed
3. If the GET status code is 200 OK, the resource validation is considered passed, otherwise failed
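
The validation logic above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: the names `resolve_page_url`, `validate_resource` and the `send` callback (which returns an HTTP status code for a method and URL) are hypothetical, and statuses such as BROKEN or FILTERED are left out.

```python
from typing import Callable
from urllib.parse import urljoin

# Status codes of a HEAD response that trigger a follow-up GET request.
RETRY_WITH_GET = {404, 405, 501, 503}


def resolve_page_url(main_page_url: str, page: str) -> str:
    """Resolve a relative page URL against the main page URL."""
    return urljoin(main_page_url, page)


def validate_resource(url: str, send: Callable[[str, str], int]) -> str:
    """Apply the HEAD-then-GET rule and return PASSED or FAILED."""
    status = send("HEAD", url)
    if status == 200:
        return "PASSED"
    if status in RETRY_WITH_GET:
        # Some servers do not support HEAD, so the check is repeated with GET.
        if send("GET", url) == 200:
            return "PASSED"
    return "FAILED"


if __name__ == "__main__":
    page = resolve_page_url("https://elderscrolls.bethesda.net/", "/skyrim10")
    print(page)
    # Fake transport: HEAD is not allowed, GET succeeds.
    print(validate_resource(page, lambda method, _: 405 if method == "HEAD" else 200))
```

Passing the transport as a callback keeps the sketch free of network access. In a real script it could be backed by any HTTP client.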

Then all resources found by $htmlLocatorType `$htmlLocator` are valid on:$pages

Deprecated syntax (will be removed in VIVIDUS 0.7.0):

Then all resources by selector `$cssSelector` are valid on:$pages

$htmlLocatorType - The HTML locator type, either CSS selector or XPath.
$htmlLocator - The actual locator.
$pages - The pages to validate resources on.

Example 4. Validate resources located by XPath

Then all resources found by xpath `//a` are valid on:
|pages                        |
|https://vividus.org/         |
|/test-automation-made-awesome|

Validate resources from HTML

Validates resources from HTML document.

Resource validation logic:

Collect elements by the specified locator from the specified HTML document
Get either the href or src attribute value from each element; if neither attribute exists, the validation fails. If the attribute value contains a relative URL, it is resolved against the URL in the web-application.main-page-url property
For each received value, execute a HEAD request
1. If the status code is 200 OK, the resource validation is considered passed
2. If the status code is one of 404 Not Found, 405 Method Not Allowed, 501 Not Implemented, 503 Service Unavailable, a GET request is executed
3. If the GET status code is 200 OK, the resource validation is considered passed, otherwise failed

Then all resources found by $htmlLocatorType `$htmlLocator` in $html are valid

Deprecated syntax (will be removed in VIVIDUS 0.7.0):

Then all resources by selector `$cssSelector` from $html are valid

$htmlLocatorType - The HTML locator type, either CSS selector or XPath.
$htmlLocator - The actual locator.
$html - The HTML document to validate.

Example 5. Validate resources from the current page

Then all resources found by CSS selector `a,img` in ${source-code} are valid

Validate redirects

The step has been moved to vividus-plugin-rest-api.

Validate SSL rating

Performs SSL scanning using SSL Labs and compares the received grade with the expected one.

Then SSL rating for URL `$url` is $comparisonRule `$gradeName`

$url - The URL for SSL scanning and grading.
$comparisonRule - The comparison rule.
$gradeName - The name of grade. The possible values: A+, A, A-, B, C, D, E, F, T, M.

Table 1. Properties
Property Name	Acceptable values	Default	Description
`ssl-labs.api-endpoint`	`URL`	`https://api.ssllabs.com`	SSL Labs endpoint.

Example 6. Validate SSL rating for https://www.google.com

Then SSL rating for URL `https://www.google.com` is equal to `B`

Cookie Steps

Set browser cookies to the HTTP context

Sets the current browser cookies to the HTTP client so that they are used in subsequent HTTP requests.

When I set browser cookies to HTTP context

Example 7. Open web page and send GET request with cookies from the opened page

Given I am on main application page
When I set browser cookies to HTTP context
When I execute HTTP GET request for resource with relative URL `/get`

Set HTTP cookies to the browser

Sets the current cookies from the previous HTTP calls to the browser. Make sure that the domain of the cookies received from the HTTP calls matches the domain of the page opened in the browser. The actions performed by the step:

add the cookies;
refresh the current page (this action is required to apply the changes in cookies).

When I set HTTP context cookies to browser

Example 8. Send GET request and then put received cookies to the browser

When I execute HTTP GET request for resource with relative URL `/get`
Given I am on main application page
When I set HTTP context cookies to browser

Set HTTP cookies to the browser without applying changes

Sets the current cookies from the previous HTTP calls to the browser, but does not apply the cookie changes instantly. The current page must be refreshed or a navigation must be performed to apply the cookie changes. Make sure that the domain of the cookies received from the HTTP calls matches the domain of the page opened in the browser.

When I set HTTP context cookies to browser without applying changes

Example 9. Send GET request and then put received cookies to the browser but without refreshing the page

When I execute HTTP GET request for resource with relative URL `/get`
Given I am on main application page
When I set HTTP context cookies to browser without applying changes