Cybersecurity - Chapter 1
Web Applications Foundation:
SDLC: Software Development Life Cycle.
The Software Development Life Cycle (SDLC) is
a framework that defines the activities performed during the software
development process.
There are 6
phases in the SDLC model, as given below.
·
Requirement:
In this phase, all requirements are collected from the customer/client. They are recorded in documents called the Business Requirement Specification (BRS) and the System Requirement Specification (SRS). All details are discussed with the customer/client.
·
Design:
It has two steps:
High-level design (HLD): It gives the architecture of
the software product.
Low-level design (LLD): It describes how each
feature and every component of the product
should work.
·
Implementation:
This is the longest phase.
This phase consists of front end + middleware
+ back end.
Front end: the user-facing code is developed, and
SEO settings are configured.
Middleware: it connects the front end and the
back end.
Back end: the database is created.
·
Testing:
Testing is carried out to verify the entire system. The aim of the tester is to find gaps and defects in the system and to check whether the system runs according to the requirements of the customer/client.
·
Deployment:
After successful testing, the product is delivered/deployed to the client, and the client is trained in how to use the product.
·
Maintenance:
Once the product has been delivered to the client, maintenance begins: whenever the client reports an error, the issue should be fixed, and this continues for as long as the product is in use.
Client-server architecture & Understanding HTTP
HyperText Transfer Protocol (HTTP) is one
of the fundamentals of the internet. Every web developer, even a front-end
developer, should have at least a basic understanding of what HTTP is. It is
the mechanism that enables computers to talk to each other over the web, and it
defines the format in which messages are passed between clients and servers.
Background
Okay, so the internet began as a
government project of the US around the 1960s-80s and reached commercial
adoption around the 1990s. The internet is a huge interconnected web of
computers. Each computer is assigned a unique address known as an IP address
(e.g. 182.179.188.131). Computers on a network identify other computers on the
same network by their unique IP addresses.
So, we have a mesh of wires
connecting different computers. These wires go through oceans, through the
ground, over and under buildings in many cases. How can we ensure that the
signal sent by one computer will reach the other computer even though there may
be many twists and turns and possibly other blockages in the wire?
That’s where the Transmission
Control Protocol (TCP) comes in. TCP is designed to ensure that
messages (packets, as we call them in computer networks) are reliably delivered from one
computer to another over a network. TCP forms the backbone of most of the
internet.
Now, TCP’s responsibility is to
ensure that any packet sent by a computer reliably reaches its target IP
address. However, TCP does not define the structure of the message. The
structure of the messages passed on the web is defined by HTTP.
Client-server architecture
To understand HTTP we must first
understand client-server architecture. In a client-server architecture we have
one system or computer that is the client. The client requests some information
from the server. The server is a system or computer whose job is to respond to
client’s requests with the relevant data.
The HTTP mechanism is comprised of
a request-response cycle. The client can send an HTTP request to the server and
it’s the server’s job to respond to the request with relevant data. Most HTTP
responses contain HTML.
The internet is simply a huge
network of clients and servers. If you type alazierplace.com into your
browser’s address bar and press enter, your browser is the client and it will
send an HTTP request to my server. My server then responds to the HTTP request
by sending the HTML of my website in the response.
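The request-response cycle described above can be tried locally without touching a real website. The following is a minimal sketch using only Python's standard library. The names `HelloHandler` and `fetch_once`, and the HTML the server returns, are made up for illustration; the server binds to a free port on 127.0.0.1 and plays the role of "my server", while `urllib` plays the role of the browser.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<h1>Hello from the server</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    """Start a local server, act as the client once, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An empty ProxyHandler makes sure the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://127.0.0.1:{port}/") as response:
            return response.status, response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


status, html = fetch_once()
print(status, html)
```

The client never sees how the server produced the page; it only sees the response that came back for its request.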
Types of HTTP Requests
Every HTTP/1.x request is a plain-text
message which is formatted in a certain way. The formatting is very well
defined and documented by the IETF in the HTTP specification, which is
published as a series of RFCs. If you ever want
to know why something works the way it does in HTTP, the HTTP specification is
the best place to find out. It is somewhat cumbersome to navigate, but I
learn something new every time I have to check a reference for something. So,
RTFM.
The HTTP specification defines a
list of request types that are used to request data from HTTP servers. These
are formally known as HTTP methods or verbs. The method or verb signifies the
intent of the client when making a request.
Here are the most common HTTP methods and their conventional use
cases:
GET: GET
requests are the most common type of request on the web. They simply retrieve
data from a server. They are very often used to request HTML. Every time
you visit any website in your browser, the browser sends a GET request for the
content of the site. By convention these requests do not have a request body, so any
extra information sent with a GET request has to be appended to the URL as a query
string. URLs tend to end up in browser history and server logs, which makes GET
requests a poor choice for sending sensitive information
such as emails and passwords.
POST: POST
requests are used to send arbitrary amounts of data to the server. The data is
sent as the request body and not in the query string. By convention, POST
requests are used when a new entity is to be created on the server from
submitted data.
PUT: PUT
requests are similar to POST requests in that they are used to submit data to
the server. However, the convention is that PUT requests should be used when we
have to completely replace some already existing entity on the server.
PATCH: PATCH
requests are used to partially update some existing entity on the server. They submit
data in the request body, similar to POST and PUT requests.
DELETE: DELETE
requests are similar to GET requests in that they usually have no body. They are the
opposite of GET. While GET is used to retrieve data, DELETE requests are used
to delete data from the server (no duh?).
These are some of the common HTTP
methods used. There are others like OPTIONS, HEAD and TRACE; the details are in
the HTTP specification.
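To make the difference between the methods concrete, here is a small sketch that builds (but never sends) a few requests with Python's standard `urllib`. The URL `https://api.example.com/articles` and the helper names `build_get` and `build_with_body` are hypothetical; the point is only where the data ends up for each method.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com/articles"  # hypothetical endpoint


def build_get(params):
    """GET carries extra data in the URL query string, not in a body."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(f"{BASE_URL}?{query}", method="GET")


def build_with_body(method, fields):
    """POST, PUT and PATCH carry their data in the request body."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(BASE_URL, data=body, method=method)


get_request = build_get({"page": 2, "tag": "http"})
post_request = build_with_body("POST", {"title": "Hello"})
delete_request = urllib.request.Request(f"{BASE_URL}/42", method="DELETE")

# Nothing is sent over the network; we only inspect what would be sent.
for request in (get_request, post_request, delete_request):
    print(request.get_method(), request.full_url, request.data)
```

Note how the GET request exposes its parameters in the URL itself, while the POST request keeps the URL clean and puts the data in `request.data`.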
Structure of HTTP Requests
The structure of an HTTP request
is broken down into the following parts.
The request line
Zero or more headers
A blank line denoting the
beginning of the body
An optional message body
The following is an example of an HTTP GET request your browser sends
when you type google.com into the browser.
GET / HTTP/2
Host: www.google.com
authority: www.google.com
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
accept-encoding: gzip, deflate
accept-language: en-US,en;q=0.9
And here is an example of an HTTP POST request with data in the
body.
POST /test HTTP/1.1
Host: foo.example
Content-Type: application/x-www-form-urlencoded
Content-Length: 27

field1=value1&field2=value2
There is a subtle difference
between GET and POST which I will explain soon. As you can see, an HTTP request
is nothing more than plain text with some specific formatting. Let’s dissect
this file.
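Since an HTTP/1.1 request is just formatted text, it can be assembled by hand. The sketch below rebuilds the POST example from its four parts: request line, headers, blank line and body. The function name `build_request` is made up for illustration, and the Content-Length header is computed from the body rather than typed in.

```python
def build_request(method, target, headers, body=""):
    """Assemble an HTTP/1.1 request: request line, headers, blank line, body."""
    body_bytes = body.encode("utf-8")
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    if body_bytes:
        # Content-Length counts bytes in the body, not characters.
        lines.append(f"Content-Length: {len(body_bytes)}")
    # HTTP/1.1 uses CRLF line endings; the empty line ends the headers.
    return "\r\n".join(lines) + "\r\n\r\n" + body


raw = build_request(
    "POST",
    "/test",
    {"Host": "foo.example", "Content-Type": "application/x-www-form-urlencoded"},
    "field1=value1&field2=value2",
)
print(raw)
```

The computed length matches the `Content-Length: 27` in the example, because the body `field1=value1&field2=value2` is 27 bytes long.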
Request Line
The very first line in an HTTP
request is known as the request line.
GET / HTTP/2
It comprises 3 parts:
The very first is the HTTP method
used. It can be GET, PUT, POST, OPTIONS or any other valid method. The method
defines the type of request being sent to the server and dictates what data can
or cannot be passed with the request.
Second is the request URI: / and
/test in our examples. This defines which resource on the server the client is
requesting.
Third, is the HTTP protocol
version.
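The three parts of the request line are separated by single spaces, so splitting it is straightforward. This is a minimal sketch; `RequestLine` and `parse_request_line` are names chosen here for illustration, and the validation is deliberately light.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    target: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split a request line such as 'GET / HTTP/2' into its three parts."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 space-separated parts, got {len(parts)}")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"not an HTTP version: {version!r}")
    return RequestLine(method, target, version)


print(parse_request_line("GET / HTTP/2"))
print(parse_request_line("POST /test HTTP/1.1"))
```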
HTTP Headers
After the request line, we have a
series of key-value pairs, each separated by a :. These are known as HTTP headers, and
their names are case-insensitive. Headers are how additional information is sent to the
server. The registered header fields are listed in a registry maintained by IANA. Standard
headers are well defined, and each one accepts a specific format of value. It is also possible
to define custom headers, historically with an X- prefix, e.g.
X-Powered-By: PHP/5.2.17
However, the X- prefix convention has since been
deprecated (RFC 6648). Ever wondered how browser cookies are sent along with the request? In
the Cookie header. Headers carry meta information about the request, such as
the content type of the request body and which response types are acceptable.
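Here is a short sketch of reading headers with Python's standard library, showing that header names can be looked up in any letter case and that cookies arrive as one ordinary header. The header values, including the cookie names `session` and `theme`, are invented for this example.

```python
import http.client
import io
from http.cookies import SimpleCookie

# A header block as it would appear on the wire (values are made up).
RAW_HEADERS = (
    b"Host: foo.example\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Cookie: session=abc123; theme=dark\r\n"
    b"\r\n"
)

# parse_headers reads "Name: value" lines until it hits the blank line.
headers = http.client.parse_headers(io.BytesIO(RAW_HEADERS))

# Lookups ignore case, because header names are case-insensitive.
print(headers["content-type"])
print(headers["CONTENT-TYPE"])

# The Cookie header packs several name=value pairs into one line.
cookies = SimpleCookie()
cookies.load(headers["Cookie"])
print({name: morsel.value for name, morsel in cookies.items()})
```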
Request Body
The last and final part of a
request is the body. After the headers, there is a blank line which signifies
that the headers are finished and that what follows is the request body. The
request body is used to send arbitrary data to the server. It could be JSON,
simple text or even binary data.
The body is only meaningful with
certain HTTP methods. GET requests conventionally don’t have a body. POST, PUT and PATCH
requests do. The type of data in the body is declared by the Content-Type
header. If you’re sending JSON data to the server then the Content-Type would
be application/json.
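As a sketch of how the body and the Content-Type header go together, the snippet below prepares a JSON request with `urllib` without sending it. The URL and the payload are placeholders, and `json_request` is a helper name invented here.

```python
import json
import urllib.request


def json_request(url, payload, method="POST"):
    """Build a request whose body is JSON and whose Content-Type says so."""
    body = json.dumps(payload).encode("utf-8")  # the body is sent as bytes
    return urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )


request = json_request(
    "https://api.example.com/users",  # hypothetical endpoint
    {"name": "Ada", "admin": False},
)
print(request.get_method(), request.header_items(), request.data)
```

The server relies on the Content-Type header to decide how to decode the bytes in the body, so the two should always agree.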
Structure of HTTP Responses
As stated, HTTP requests and
responses are just plain text. So the structure of an HTTP response is
similar to that of a request, with some differences. A response consists of the following parts.
A status line
Zero or more headers
A blank line denoting the
beginning of the body
An optional message body
Below is an example of an HTTP
response.
HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Sun, 19 May 2019 12:50:50 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Sun, 19 May 2019 12:49:34 GMT
Connection: keep-alive
ETag: "5ce150de-264"
Accept-Ranges: bytes

<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma,
Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully
installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a
href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a
href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Let’s break it down shall we?
Status Line
The status line is the first thing
in an HTTP response
HTTP/1.1 200 OK
In order, it has 2 parts:
The HTTP version
The HTTP status code and its
textual meaning
In the above example, HTTP/1.1 is
the version and 200 OK is the status code and text.
Response Headers
Just like with the request, HTTP
responses contain headers which carry metadata about the response. These are
simply key-value pairs separated by a colon (:).
We can see that the Content-Type
header in the above example contains the value text/html, denoting that the body
contains HTML.
Response Body
The response body
contains the actual content sent by the server. Most HTTP
responses have a response body. The content of the response can be virtually
anything: HTML, JSON, XML, you name it. However, most HTTP responses
contain HTML or JSON.
In the above example, the response
body starts after the first blank line and contains the actual
HTML that will be sent to the browser.
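The parts of a response can be pulled apart with a few string operations. The following is a simplified sketch for a complete HTTP/1.1 response held in memory; `parse_response` is a name chosen here, and real clients also have to deal with streaming, chunked bodies and repeated headers, which this ignores.

```python
def parse_response(raw: str):
    """Split a raw HTTP/1.1 response into (version, code, reason, headers, body)."""
    # The first blank line separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, *rest = status_line.split(" ", 2)
    reason = rest[0] if rest else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, int(code), reason, headers, body


BODY = "<h1>Welcome to nginx!</h1>"
RAW = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    f"Content-Length: {len(BODY)}\r\n"
    "\r\n"
    f"{BODY}"
)

version, code, reason, headers, body = parse_response(RAW)
print(version, code, reason)
print(headers)
print(body)
```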
HTTP Status Codes
One of the most popular and useful
things about HTTP is its use of status codes. Status codes indicate the outcome
of a request for a resource or document. The status code is simply the number in
the status line of a response and is always a three-digit number, e.g. 200,
302, 404 etc.
Here are some of the most popular status codes. You might’ve seen some of
them on the web.
200 OK: Indicates a successful
request and returns the response.
301 Moved Permanently: The
resource has been moved to a new location. Mostly used to redirect to the
resource.
400 Bad Request: Used when there
is an error in the HTTP request.
403 Forbidden: Indicates that the
client does not have the necessary authorization to access the resource.
404 Not Found: The resource is not
found.
418 I’m a teapot: Used when the
server refuses to brew coffee because it’s a teapot. (Yes, this is real).
500 Internal Server Error: Raised
when the web server encounters an error that is not handled properly.
And the full list of codes can be
found at the end of this article.
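Most languages ship these codes as named constants, so there is no need to memorise them. As a small sketch, Python's standard `http.HTTPStatus` enum maps a number to its textual meaning; the helper name `describe` is made up here.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return 'code phrase' for a status code, e.g. '404 Not Found'."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Codes missing from the enum are non-standard or newer than this Python.
        return f"{code} (not a status known to this Python version)"
    return f"{status.value} {status.phrase}"


for code in (200, 301, 400, 403, 404, 500):
    print(describe(code))
```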
Security
Now, there is a lot to be said
about security in HTTP. HTTP does provide basic access authentication. What
that means is that the client would have to supply a username and a password to
get access to the resource on the server. But that is just a rudimentary
mechanism and we need better security in modern apps.
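To see why basic access authentication is only rudimentary, it helps to look at what the client actually sends. In this sketch the helper names `basic_auth_header` and `decode_basic_auth` are invented for illustration; the credentials are simply joined with a colon and Base64-encoded, which is an encoding and not encryption.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth(header_value: str):
    """Recover (username, password) from the header; no key is needed."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print("Authorization:", header)
print(decode_basic_auth(header))
```

Anyone who can read the header can reverse it in one line, which is why Basic auth is only reasonable over an encrypted connection.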
HTTP was created in such a way
that makes it easy to read by humans (i.e. it uses plain text and formatting).
However, this introduces some complications. By default HTTP does not encode or
encrypt the message or headers in any way. So, if a request is sent to a server
by a client, that request goes through a network of routers, switches
and servers. Any one of those devices can read the content of the
request or response. So if I send the username and password of my bank account
through an HTTP POST request, any router or server that sits between the
bank’s server and my computer can read my credentials. There are ways to overcome
this which I will talk about below.
A Stateless protocol. HTTP is
designed as a stateless protocol. What this means is that in a pure HTTP
request/response cycle, the client and the server do not retain any information
about each other between requests. It is not the responsibility of the HTTP protocol itself to
verify the authenticity of the client or the server. While this decision has
decreased the implementation complexity of the protocol, I believe it
leaves room for attacks such as phishing and cross-site request
forgery. So web developers have developed techniques to guard against such
attacks. In my opinion, such security measures should be built into the
protocol.
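One such technique is the CSRF token: the server hands the real page a random secret and later refuses any state-changing request that does not echo it back. The following is a minimal sketch of that idea only. The function names are hypothetical, and where the token is stored (a server-side session is assumed here) depends entirely on the framework in use.

```python
import hmac
import secrets


def new_csrf_token() -> str:
    """Generate an unguessable token to store in the user's session."""
    return secrets.token_urlsafe(32)


def is_valid_csrf(session_token: str, submitted_token: str) -> bool:
    """Check the token submitted with a form against the one in the session."""
    if not session_token or not submitted_token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        session_token.encode("utf-8"), submitted_token.encode("utf-8")
    )


token = new_csrf_token()           # embedded in the form served to the user
print(is_valid_csrf(token, token))      # genuine submission
print(is_valid_csrf(token, "forged"))   # request forged by another site
```

A page on another site can make the browser send a request, but it cannot read the token, so the forged request fails the check.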
Best Practices
Here I will list down what I think
are best practices when using HTTP. Whether it be spinning up a traditional
template based app, creating an API or creating a microservices based
architecture, here is what I think we should do.
Use proper HTTP status codes: HTTP
status codes were defined for a reason, and they form a standard set of rules
that most systems on the web follow. So we should use proper HTTP codes in our
APIs and servers. If it’s a success return 200, if it’s a redirect return 302
and if it’s not there, return 404. This enables us to build more robust clients
and servers.
Always use SSL/TLS: As I said,
HTTP traffic is plain text, so anyone or any system along the way can read it. To prevent this
we should always serve sites over HTTPS with a TLS certificate, so that the request and response
are encrypted in transit and only readable by the intended client and server.
Use CSRF security: HTTP being
stateless, we can’t ensure the validity of clients on servers. So we should use
a CSRF token or similar mechanism to prevent cross-site request forgery
attacks.
Use JWTs for secure APIs: When building
APIs, I find the best way to grant secure access to clients is through JSON Web
Tokens. JWTs should be signed with a secure algorithm and secret key. Also, they
should be provided in the Authorization header instead of browser cookies.
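To illustrate the last point, here is a rough sketch of how an HS256-signed JWT is put together and checked, using only Python's standard library. It assumes the HS256 algorithm and a shared secret, all function names are invented for this example, and it skips things a real API needs (expiry checks, algorithm validation, key rotation), so a maintained JWT library is the better choice in production.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Return header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return f"{signing_input}.{_b64url(signature.digest())}"


def verify_jwt(token: str, secret: bytes) -> dict:
    """Return the claims if the signature matches, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload_b64))


def authorization_header(token: str) -> str:
    """JWTs travel in the Authorization header using the Bearer scheme."""
    return f"Bearer {token}"


secret = b"demo-secret-not-for-production"
token = sign_jwt({"sub": "user-1"}, secret)
print("Authorization:", authorization_header(token))
print(verify_jwt(token, secret))
```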
Complete list of HTTP Response Status Codes
HTTP response status codes
indicate whether a specific HTTP request has been successfully completed.
Responses are grouped in five classes:
1. Informational responses (100–199),
2. Successful responses (200–299),
3. Redirects (300–399),
4. Client errors (400–499),
5. and Server errors (500–599).
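Many of the status codes below are defined
by section 10 of RFC 2616; others come from later RFCs and extensions such as
WebDAV. You can find an updated specification in RFC 7231, which has itself
since been superseded by RFC 9110.
If you receive a response that is not in this list, it is a
non-standard response, possibly custom to the server's software.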
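Because the class is determined by the first digit, a client can react sensibly even to a code it has never seen. A small sketch follows; the function name `status_class` and the label strings are chosen here for illustration.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a status code to one of the five classes using its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"{code} is outside the 100-599 range")
    return STATUS_CLASSES[code // 100]


for code in (103, 204, 308, 451, 511):
    print(code, status_class(code))
```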
Informational responses
100 Continue
This interim response indicates
that everything so far is OK and that the client should continue the request,
or ignore the response if the request is already finished.
101 Switching Protocols
This code is sent in response to
an Upgrade request header from the client, and indicates the protocol the
server is switching to.
102 Processing (WebDAV)
This code indicates that the
server has received and is processing the request, but no response is available
yet.
103 Early Hints
This status code is primarily
intended to be used with the Link header, letting the user agent start
preloading resources while the server prepares a response.
Successful responses
200 OK
The request has succeeded. The
meaning of the success depends on the HTTP method:
• GET: The resource has been fetched and is transmitted in the message body.
• HEAD: The headers are returned without any message body.
• PUT or POST: The resource describing the result of the action is transmitted in the message body.
• TRACE: The message body contains the request message as received by the server.
201 Created
The request has succeeded and a
new resource has been created as a result. This is typically the response sent
after POST requests, or some PUT requests.
202 Accepted
The request has been received but
not yet acted upon. It is noncommittal, since there is no way in HTTP to later
send an asynchronous response indicating the outcome of the request. It is
intended for cases where another process or server handles the request, or for
batch processing.
203 Non-Authoritative Information
This response code means the
returned meta-information is not exactly the same as is available from the
origin server, but is collected from a local or a third-party copy. This is
mostly used for mirrors or backups of another resource. Except for that
specific case, the "200 OK" response is preferred to this status.
204 No Content
There is no content to send for
this request, but the headers may be useful. The user-agent may update its
cached headers for this resource with the new ones.
205 Reset Content
Tells the user-agent to reset the
document which sent this request.
206 Partial Content
This response code is used when
the Range header is sent from the client to request only part of a resource.
207 Multi-Status (WebDAV)
Conveys information about multiple
resources, for situations where multiple status codes might be appropriate.
208 Already Reported (WebDAV)
Used inside a <dav:propstat>
response element to avoid repeatedly enumerating the internal members of
multiple bindings to the same collection.
226 IM Used (HTTP Delta encoding)
The server has fulfilled a GET
request for the resource, and the response is a representation of the result of
one or more instance-manipulations applied to the current instance.
Redirection messages
300 Multiple Choices
The request has more than one
possible response. The user-agent or user should choose one of them. (There is
no standardized way of choosing one of the responses, but HTML links to the
possibilities are recommended so the user can pick.)
301 Moved Permanently
The URL of the requested resource
has been changed permanently. The new URL is given in the response.
302 Found
This response code means that the
URI of requested resource has been changed temporarily. Further changes in the
URI might be made in the future. Therefore, this same URI should be used by the
client in future requests.
303 See Other
The server sent this response to
direct the client to get the requested resource at another URI with a GET
request.
304 Not Modified
This is used for caching purposes.
It tells the client that the response has not been modified, so the client can
continue to use the same cached version of the response.
305 Use Proxy
Defined in a previous version of
the HTTP specification to indicate that a requested response must be accessed
by a proxy. It has been deprecated due to security concerns regarding in-band
configuration of a proxy.
306 unused
This response code is no longer
used; it is just reserved. It was used in a previous version of the HTTP/1.1
specification.
307 Temporary Redirect
The server sends this response to
direct the client to get the requested resource at another URI with same method
that was used in the prior request. This has the same semantics as the 302
Found HTTP response code, with the exception that the user agent must not
change the HTTP method used: If a POST was used in the first request, a POST
must be used in the second request.
308 Permanent Redirect
This means that the resource is
now permanently located at another URI, specified by the Location: HTTP
Response header. This has the same semantics as the 301 Moved Permanently HTTP
response code, with the exception that the user agent must not change the HTTP
method used: If a POST was used in the first request, a POST must be used in
the second request.
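The practical difference between these redirect codes is what happens to the HTTP method on the follow-up request. The sketch below encodes the behaviour typical of browsers as I understand it: 307 and 308 keep the method, 303 switches to GET, and 301 and 302 are commonly followed with GET when the original request was a POST. The function name `method_after_redirect` is hypothetical, and individual clients may differ.

```python
def method_after_redirect(status: int, original_method: str) -> str:
    """Return the method a typical user agent uses for the redirected request."""
    method = original_method.upper()
    if status in (307, 308):
        return method  # the method must not change
    if status == 303:
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Historically, clients rewrite POST to GET for these two codes.
        return "GET" if method == "POST" else method
    raise ValueError(f"{status} is not a redirect handled by this sketch")


for status in (301, 302, 303, 307, 308):
    print(status, method_after_redirect(status, "POST"))
```

This is why 307 and 308 are the safer choice when a form submission or API call must arrive at the new location with its method and body intact.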
Client error responses
400 Bad Request
The server could not understand
the request due to invalid syntax.
401 Unauthorized
Although the HTTP standard
specifies "unauthorized", semantically this response means
"unauthenticated". That is, the client must authenticate itself to
get the requested response.
402 Payment Required
This response code is reserved for
future use. The initial aim for creating this code was using it for digital
payment systems, however this status code is used very rarely and no standard
convention exists.
403 Forbidden
The client does not have access
rights to the content; that is, it is unauthorized, so the server is refusing
to give the requested resource. Unlike 401, the client's identity is known to
the server.
404 Not Found
The server can not find the
requested resource. In the browser, this means the URL is not recognized. In an
API, this can also mean that the endpoint is valid but the resource itself does
not exist. Servers may also send this response instead of 403 to hide the
existence of a resource from an unauthorized client. This response code is
probably the most famous one due to its frequent occurrence on the web.
405 Method Not Allowed
The request method is known by the
server but has been disabled and cannot be used. For example, an API may forbid
DELETE-ing a resource. The two mandatory methods, GET and HEAD, must never be
disabled and should not return this error code.
406 Not Acceptable
This response is sent when the web
server, after performing server-driven content negotiation, doesn't find any
content that conforms to the criteria given by the user agent.
407 Proxy Authentication Required
This is similar to 401 but
authentication is needed to be done by a proxy.
408 Request Timeout
This response is sent on an idle
connection by some servers, even without any previous request by the client. It
means that the server would like to shut down this unused connection. This
response is used much more since some browsers, like Chrome, Firefox 27+, or
IE9, use HTTP pre-connection mechanisms to speed up surfing. Also note that
some servers merely shut down the connection without sending this message.
409 Conflict
This response is sent when a
request conflicts with the current state of the server.
410 Gone
This response is sent when the
requested content has been permanently deleted from server, with no forwarding
address. Clients are expected to remove their caches and links to the resource.
The HTTP specification intends this status code to be used for
"limited-time, promotional services". APIs should not feel compelled
to indicate resources that have been deleted with this status code.
411 Length Required
Server rejected the request
because the Content-Length header field is not defined and the server requires
it.
412 Precondition Failed
The client has indicated
preconditions in its headers which the server does not meet.
413 Payload Too Large
Request entity is larger than
limits defined by server; the server might close the connection or return a
Retry-After header field.
414 URI Too Long
The URI requested by the client is
longer than the server is willing to interpret.
415 Unsupported Media Type
The media format of the requested
data is not supported by the server, so the server is rejecting the request.
416 Range Not Satisfiable
The range specified by the Range
header field in the request can't be fulfilled; it's possible that the range is
outside the size of the target URI's data.
417 Expectation Failed
This response code means the
expectation indicated by the Expect request header field can't be met by the
server.
418 I'm a teapot
The server refuses the attempt to
brew coffee with a teapot.
421 Misdirected Request
The request was directed at a
server that is not able to produce a response. This can be sent by a server
that is not configured to produce responses for the combination of scheme and
authority that are included in the request URI.
422 Unprocessable Entity (WebDAV)
The request was well-formed but
was unable to be followed due to semantic errors.
423 Locked (WebDAV)
The resource that is being
accessed is locked.
424 Failed Dependency (WebDAV)
The request failed due to failure
of a previous request.
425 Too Early
Indicates that the server is
unwilling to risk processing a request that might be replayed.
426 Upgrade Required
The server refuses to perform the
request using the current protocol but might be willing to do so after the
client upgrades to a different protocol. The server sends an Upgrade header in
a 426 response to indicate the required protocol(s).
428 Precondition Required
The origin server requires the
request to be conditional. This response is intended to prevent the 'lost
update' problem, where a client GETs a resource's state, modifies it, and PUTs
it back to the server, when meanwhile a third party has modified the state on
the server, leading to a conflict.
429 Too Many Requests
The user has sent too many
requests in a given amount of time ("rate limiting").
431 Request Header Fields Too Large
The server is unwilling to process
the request because its header fields are too large. The request may be
resubmitted after reducing the size of the request header fields.
451 Unavailable For Legal Reasons
The user-agent requested a
resource that cannot legally be provided, such as a web page censored by a
government.
Server error responses
500 Internal Server Error
The server has encountered a
situation it doesn't know how to handle.
501 Not Implemented
The request method is not
supported by the server and cannot be handled. The only methods that servers
are required to support (and therefore that must not return this code) are GET
and HEAD.
502 Bad Gateway
This error response means that the
server, while working as a gateway to get a response needed to handle the
request, got an invalid response.
503 Service Unavailable
The server is not ready to handle
the request. Common causes are a server that is down for maintenance or that is
overloaded. Note that together with this response, a user-friendly page
explaining the problem should be sent. This response should be used for
temporary conditions and the Retry-After: HTTP header should, if possible,
contain the estimated time before the recovery of the service. The webmaster
must also take care about the caching-related headers that are sent along with
this response, as these temporary condition responses should usually not be
cached.
504 Gateway Timeout
This error response is given when
the server is acting as a gateway and cannot get a response in time.
505 HTTP Version Not Supported
The HTTP version used in the
request is not supported by the server.
506 Variant Also Negotiates
The server has an internal
configuration error: the chosen variant resource is configured to engage in
transparent content negotiation itself, and is therefore not a proper end point
in the negotiation process.
507 Insufficient Storage (WebDAV)
The method could not be performed
on the resource because the server is unable to store the representation needed
to successfully complete the request.
508 Loop Detected (WebDAV)
The server detected an infinite
loop while processing the request.
510 Not Extended
Further extensions to the request
are required for the server to fulfill it.
511 Network Authentication Required
The 511 status code indicates that
the client needs to authenticate to gain network access.