If you are going to use cryptography in the browser, there’s a good chance you will want to deal with X.509 certificates. This post is going to get started by using the Web Cryptography API to do two operations on certificates:
- Import a public key from an X.509 certificate
- Verify the certificate authority (CA) signature on an X.509 certificate
To keep it simple, the example used will be a root CA certificate because those are self-signed. That means we won’t need to get the public key out of one certificate (the CA’s) and use that to verify the signature on another certificate. Instead, we use the same certificate for both. But the concepts and code in this post would be the same for a non-root certificate.
Note: really validating a certificate signature generally requires validating the signatures on a chain of certificates, not just one: the certificate in question, the certificate used to sign it, the certificate used to sign that one, and so on, until you reach the self-signed root certificate. The root certificate has to be verified “out of band” because the chain has to end sometime. Again, this uses the same ideas as here, just repeated as often as needed.
The example will reference the current draft of the Web Cryptography API and RFC 5280, the IETF definition of X.509 certificates. The self-signed root certificate we will use is the VeriSign Universal Root Certification Authority certificate, because VeriSign is a major CA that intends to begin using it as the basis of issued certificates, and it should be available on almost any PC with a web browser. Just in case it isn’t, you can get a copy directly from the CA.
Instead of creating a test web page with sample code, all the examples in this post will be made from the browser’s developer console. Google Chrome 38 was used to write it, but it should work in any browser that supports the Web Cryptography API.
Certificate Formats
If you look at the downloaded certificate in a text editor, you should see text starting with:
-----BEGIN CERTIFICATE-----
MIIEuTCCA6GgAwIBAgIQQBrEZCGzEyEDDrvkEhrFHTANBgkqhkiG9w0BAQsFADCB
and ending with:
7M2CYfE45k+XmCpajQ==
-----END CERTIFICATE-----
This is known as PEM format (for Privacy-Enhanced Mail). PEM files can have one or more cryptographic objects of various types in them, delineated with BEGIN and END lines. In between the lines is a base-64 encoded version of the object in a closer-to-native format. For the CERTIFICATE type, that is a base-64 encoding of the certificate in ASN.1 DER format. The last post went into a lot of detail about this encoding, and included a JavaScript function, berToJavaScript, to convert from that binary format to a JavaScript object.
So the first step is going to be base-64 decoding the certificate into a byte array (Uint8Array) for further processing. Get the base-64 encoded part of the PEM file (the stuff between the BEGIN and END lines) into a JavaScript string named encoded. It’s easier if you first put it in a text editor and concatenate it into one line, because a quoted JavaScript string cannot span lines. Then just do a simple variable assignment:
var encoded = "MIIEuTCCA6 (rest of string here) XmCpajQ==";
Decode that string using the standard window.atob function, and then copy the bytes in the resulting binary string to a byte array:
var decoded = window.atob(encoded);
var der = new Uint8Array(decoded.length);
for (var i = 0; i < decoded.length; i++) {
    der[i] = decoded.charCodeAt(i);
}
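If concatenating the lines by hand is inconvenient, the same decoding can be sketched as a small helper that takes the whole PEM text. The name pemToDer is hypothetical, not part of any API, and the sketch assumes the text holds a single CERTIFICATE block and that atob is available, as it is in browsers:

```javascript
// Hypothetical helper: convert PEM text holding one certificate to DER bytes.
function pemToDer(pem) {
    // Drop the BEGIN/END lines, then all whitespace, leaving only base-64 text
    var encoded = pem
        .replace(/-----BEGIN CERTIFICATE-----/, "")
        .replace(/-----END CERTIFICATE-----/, "")
        .replace(/\s+/g, "");
    var decoded = atob(encoded); // Binary string, one character per byte
    var der = new Uint8Array(decoded.length);
    for (var i = 0; i < decoded.length; i++) {
        der[i] = decoded.charCodeAt(i);
    }
    return der;
}
```

With this, the text copied from the certificate file can be pasted into a variable as it is, line breaks included, and passed to pemToDer to get the same der byte array.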
Parsing BER/DER
Next comes the hard part: parsing the encoded object. The last post has a function called berToJavaScript that takes a byte array and returns a JavaScript object describing the BER or DER encoded object at the start of the byte array. Copy and paste that function in the browser’s JavaScript console. It returns a JavaScript object with the following fields:
- cls – the ASN.1 class of the object. An object’s class controls how the tag field (below) is interpreted. Objects of class 0 (Universal class) have standard tags, others depend on the application or context. The possible values are integers 0, 1, 2, or 3.
- tag – the ASN.1 object type. The standard tags can be found in this Wikipedia article. The values are non-negative integers.
- structured – a boolean specifying whether this is a structured object or not. Structured objects’ values are themselves encoded ASN.1 objects, one after another. Other objects are called primitive, and interpreting their values depends on the tag and class.
- byteLength – the number of bytes the entire encoded object takes.
- contents – a byte array containing the object’s value.
- raw – the bytes of the encoded object that was parsed, header included: the first byteLength bytes of the input array.
So we can parse the DER encoded object with:
var parsedCert = berToJavaScript(der);
The returned value is:
{
    cls: 0,
    tag: 16,
    structured: true,
    byteLength: 1213,
    contents: Uint8Array[1209]..., // Actual values skipped here
    raw: Uint8Array[1213]... // Again, values are skipped
}
So this is Universal class (0), which means the tag value of 16 has a standard meaning: SEQUENCE or SEQUENCE OF. It’s structured (which makes sense for a sequence). Actually parsing the certificate enough to extract the public key and verify the signature on it requires going deeper into the contents.
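To see where those numbers come from, here is a minimal sketch of decoding just the header bytes by hand. The function name describeHeader is hypothetical and is not the berToJavaScript function from the last post. It assumes a tag number below 31 and a definite length, which is all DER certificates like this one need at the top level:

```javascript
// Hypothetical helper: decode a BER/DER header with a low tag number
// (below 31) and a definite length. Other forms are not handled.
function describeHeader(bytes) {
    var first = bytes[0];
    var header = {
        cls: first >> 6,                  // Top two bits
        structured: (first & 0x20) !== 0, // Next bit
        tag: first & 0x1f                 // Low five bits
    };
    var lengthByte = bytes[1];
    if (lengthByte === 0x80) {
        throw new Error("Indefinite lengths are not handled in this sketch.");
    }
    if (lengthByte < 0x80) {              // Short form: the length itself
        header.contentLength = lengthByte;
        header.headerLength = 2;
    } else {                              // Long form: count of length bytes
        var count = lengthByte & 0x7f;
        var length = 0;
        for (var i = 0; i < count; i++) {
            length = length * 256 + bytes[2 + i];
        }
        header.contentLength = length;
        header.headerLength = 2 + count;
    }
    return header;
}

// The base-64 text MIIEuT... decodes to a certificate starting 30 82 04 b9
var header = describeHeader([0x30, 0x82, 0x04, 0xb9]);
```

For these four bytes the result has cls 0, structured true and tag 16, with a content length of 1209 and a header length of 4, which together account for the byteLength of 1213 shown above.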
Parsing an X.509 Certificate
RFC 5280 defines the logical structure of the certificate. It starts with the basic certificate fields:
Certificate ::= SEQUENCE {
    tbsCertificate       TBSCertificate,
    signatureAlgorithm   AlgorithmIdentifier,
    signatureValue       BIT STRING }
Based on our berToJavaScript result for the entire data structure, we can see we are on the right track. It had tag 16 of universal class, which matches the SEQUENCE shown here. Here’s a simple parser for a certificate:
function parseCertificate(byteArray) {
    var asn1 = berToJavaScript(byteArray);
    if (asn1.cls !== 0 || asn1.tag !== 16 || !asn1.structured) {
        throw new Error("This can't be an X.509 certificate. Wrong data type.");
    }
    var cert = {asn1: asn1}; // Include the raw parser result for debugging
    var pieces = berListToJavaScript(asn1.contents);
    if (pieces.length !== 3) {
        throw new Error("Certificate does not contain exactly the three specified children.");
    }
    cert.tbsCertificate = parseTBSCertificate(pieces[0]);
    cert.signatureAlgorithm = parseSignatureAlgorithm(pieces[1]);
    cert.signatureValue = parseSignatureValue(pieces[2]);
    return cert;
}
The berListToJavaScript function referenced above is pretty simple: start parsing at the beginning of the array, then at the first byte following the first result, and so on, until the byte array is consumed, returning an array containing each object:
function berListToJavaScript(byteArray) {
    var result = [];
    var nextPosition = 0;
    while (nextPosition < byteArray.length) {
        var nextPiece = berToJavaScript(byteArray.subarray(nextPosition));
        result.push(nextPiece);
        nextPosition += nextPiece.byteLength;
    }
    return result;
}
Now we have to write three parsers, one for each logical part. Since we’ve already converted the encoded objects to JavaScript objects, we will use them as input to the parsers. We will do that in reverse order, because that puts the simplest parsers first. The RFC says that the SignatureValue is a BIT STRING, so…
function parseSignatureValue(asn1) {
    if (asn1.cls !== 0 || asn1.tag !== 3 || asn1.structured) {
        throw new Error("Bad signature value. Not a BIT STRING.");
    }
    var sig = {asn1: asn1}; // Useful for debugging
    sig.bits = berBitStringValue(asn1.contents);
    return sig;
}
BIT STRING is a standard type that is pretty simple to parse. The contents consist of an initial byte giving the number of bits to ignore, then a byte array containing all the bits. For example, the bit string 10110 is five bits long, so any byte containing it has three more bits than the actual bit string. So this is encoded in hex as 03 b0. That means ignore the last three bits in the byte string b0 (which is 10110000 in binary).
function berBitStringValue(byteArray) {
    return {
        unusedBits: byteArray[0],
        bytes: byteArray.subarray(1)
    };
}
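The 03 b0 example can be checked in the console with a small sketch that prints a BIT STRING's contents as binary digits. The name bitStringToText is hypothetical, and the function is only meant for looking at short values, not for the 2048-bit signature:

```javascript
// Hypothetical helper: show the contents of a BIT STRING as a string of
// "0" and "1" characters, dropping the unused trailing bits.
function bitStringToText(contents) {
    var unusedBits = contents[0];
    var text = "";
    for (var i = 1; i < contents.length; i++) {
        // Pad each byte to eight binary digits
        text += ("0000000" + contents[i].toString(2)).slice(-8);
    }
    return text.slice(0, text.length - unusedBits);
}

var text = bitStringToText(new Uint8Array([0x03, 0xb0])); // "10110"
```

In certificates the unused bit count is normally 0, because signatures and encoded keys are whole numbers of bytes.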
What about the SignatureAlgorithm? The RFC says it is an AlgorithmIdentifier, which is:
AlgorithmIdentifier ::= SEQUENCE {
    algorithm    OBJECT IDENTIFIER,
    parameters   ANY DEFINED BY algorithm OPTIONAL }
Which leads to:
var parseSignatureAlgorithm = parseAlgorithmIdentifier;
function parseAlgorithmIdentifier(asn1) {
    if (asn1.cls !== 0 || asn1.tag !== 16 || !asn1.structured) {
        throw new Error("Bad algorithm identifier. Not a SEQUENCE.");
    }
    var alg = {asn1: asn1};
    var pieces = berListToJavaScript(asn1.contents);
    // The algorithm is required and the parameters are optional
    if (pieces.length < 1 || pieces.length > 2) {
        throw new Error("Bad algorithm identifier. Wrong number of child objects.");
    }
    var encodedAlgorithm = pieces[0];
    if (encodedAlgorithm.cls !== 0 || encodedAlgorithm.tag !== 6 || encodedAlgorithm.structured) {
        throw new Error("Bad algorithm identifier. Does not begin with an OBJECT IDENTIFIER.");
    }
    alg.algorithm = berObjectIdentifierValue(encodedAlgorithm.contents);
    if (pieces.length === 2) {
        alg.parameters = {asn1: pieces[1]}; // Don't need this now, so not parsing it
    } else {
        alg.parameters = null; // It is optional
    }
    return alg;
}
The parameters vary according to the algorithm. The RSA signature algorithms we will be working with carry only an ASN.1 NULL there, so the code keeps the raw parsed object without interpreting it.
Object Identifiers (OIDs) are a standard type. Their values are essentially sequences of non-negative integers representing different kinds of objects. A common way of writing them is as a list of integers with periods between them. The first two integers are taken from the first byte: the integer division of the first byte by 40, and the remainder of that division. A common first byte is 2a (hex), which is decimal 42, giving the initial two integers as 1.2.
The remaining integers are represented as lists of bytes, where all the bytes except the last one have a leading 1 bit. The leading bits are dropped and the remaining bits are interpreted as a binary integer. A common hex value is 86 f7 0d. The first two bytes have leading 1 bits and the third does not, so we first convert the hex to binary (10000110 11110111 00001101), then drop the leading bits (0000110 1110111 0001101), and convert that to decimal (113549).
The list of integers is interpreted as a hierarchical tree. The first integer identifies the top-level organization for the OID, which can assign the second integer values to member organizations, and so on. For example, the OID for an RSA signature with SHA-1 (a very common one) is 1.2.840.113549.1.1.5. How do you figure these out? Similar to the way DNS lookups work. The leading 1 means this is controlled by the International Organization for Standardization (ISO), so look up how ISO assigns the next integer. That 2 means ISO has assigned it to a member body. The 840 is the United States, which assigned 113549 to RSA Data Security, Inc. (RSADSI). RSA created the various standards we are using. They assigned 1 to their PKCS family of specifications, which assigned 1 to the PKCS-1 specification, which assigned 5 to the algorithm RSA encryption with SHA-1 hashing.
Or you can just Google 1.2.840.113549.1.1.5 to find a helpful web site that interprets these for you. Based on what we’ve already seen in the API, we will only be supporting that option and 1.2.840.113549.1.1.11 (RSA with SHA-256) for now.
Here’s the code for getting OBJECT IDENTIFIERs:
function berObjectIdentifierValue(byteArray) {
    var oid = Math.floor(byteArray[0] / 40) + "." + byteArray[0] % 40;
    var position = 1;
    while (position < byteArray.length) {
        var nextInteger = 0;
        while (byteArray[position] >= 0x80) {
            nextInteger = nextInteger * 0x80 + (byteArray[position] & 0x7f);
            position += 1;
        }
        nextInteger = nextInteger * 0x80 + byteArray[position];
        position += 1;
        oid += "." + nextInteger;
    }
    return oid;
}
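One way to check an understanding of the encoding is to run it in the other direction. The following sketch (oidToBytes is a hypothetical name, not something from the earlier post) turns a dotted OID string back into its encoded bytes. It assumes a well-formed OID with at least two integers, each small enough to be an exact JavaScript number:

```javascript
// Hypothetical helper: encode a dotted OID string as the contents bytes
// of an OBJECT IDENTIFIER.
function oidToBytes(oid) {
    var parts = oid.split(".").map(Number);
    // The first two integers share one byte
    var bytes = [parts[0] * 40 + parts[1]];
    for (var i = 2; i < parts.length; i++) {
        var value = parts[i];
        // Build seven bits at a time, starting from the last byte,
        // which is the only one without a leading 1 bit
        var group = [value % 0x80];
        value = Math.floor(value / 0x80);
        while (value > 0) {
            group.unshift(0x80 | (value % 0x80));
            value = Math.floor(value / 0x80);
        }
        bytes = bytes.concat(group);
    }
    return new Uint8Array(bytes);
}

var sha256WithRsa = oidToBytes("1.2.840.113549.1.1.11");
```

The result here is the byte sequence 2a 86 48 86 f7 0d 01 01 0b, and passing those bytes to berObjectIdentifierValue should give back the original string.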
Now for the biggest piece, the TBSCertificate. The RFC gives the definition:
TBSCertificate ::= SEQUENCE {
    version         [0] EXPLICIT Version DEFAULT v1,
    serialNumber         CertificateSerialNumber,
    signature            AlgorithmIdentifier,
    issuer               Name,
    validity             Validity,
    subject              Name,
    subjectPublicKeyInfo SubjectPublicKeyInfo,
    issuerUniqueID  [1] IMPLICIT UniqueIdentifier OPTIONAL,
    subjectUniqueID [2] IMPLICIT UniqueIdentifier OPTIONAL,
    extensions      [3] EXPLICIT Extensions OPTIONAL
}
We can continue down the path as before, defining parsers for each piece, but right now we only care about the subjectPublicKeyInfo (because we want to import that into a CryptoKey object to verify a signature) and signature (which is actually the algorithm used for the signature, which we need to know to verify a signature). So we won’t extend the parsing of any of the other pieces here:
function parseTBSCertificate(asn1) {
    if (asn1.cls !== 0 || asn1.tag !== 16 || !asn1.structured) {
        throw new Error("This can't be a TBSCertificate. Wrong data type.");
    }
    var tbs = {asn1: asn1}; // Include the raw parser result for debugging
    var pieces = berListToJavaScript(asn1.contents);
    // version is [0] EXPLICIT with DEFAULT v1, so version 1 certificates omit it
    var hasVersion = pieces.length > 0 && pieces[0].cls === 2 && pieces[0].tag === 0;
    var first = hasVersion ? 1 : 0; // Index of serialNumber
    if (pieces.length < first + 6) {
        throw new Error("Bad TBS Certificate. Required children are missing.");
    }
    tbs.version = hasVersion ? pieces[0] : null;
    tbs.serialNumber = pieces[first];
    tbs.signature = parseAlgorithmIdentifier(pieces[first + 1]);
    tbs.issuer = pieces[first + 2];
    tbs.validity = pieces[first + 3];
    tbs.subject = pieces[first + 4];
    tbs.subjectPublicKeyInfo = parseSubjectPublicKeyInfo(pieces[first + 5]);
    return tbs; // Ignore optional fields for now
}
We’re almost there. Parsing the SubjectPublicKeyInfo is easy with what’s already been written. The RFC says its structure is:
SubjectPublicKeyInfo ::= SEQUENCE {
    algorithm          AlgorithmIdentifier,
    subjectPublicKey   BIT STRING }
So the parser is just:
function parseSubjectPublicKeyInfo(asn1) {
    if (asn1.cls !== 0 || asn1.tag !== 16 || !asn1.structured) {
        throw new Error("Bad SPKI. Not a SEQUENCE.");
    }
    var spki = {asn1: asn1};
    var pieces = berListToJavaScript(asn1.contents);
    if (pieces.length !== 2) {
        throw new Error("Bad SubjectPublicKeyInfo. Wrong number of child objects.");
    }
    spki.algorithm = parseAlgorithmIdentifier(pieces[0]);
    spki.bits = berBitStringValue(pieces[1].contents);
    return spki;
}
Okay, the parser now extracts enough pieces to do the cryptography we want. We parse the certificate with:
var certificate = parseCertificate(der);
If all the code above is right, and was correctly pasted into the console, this should return an object representing the certificate.
Cryptography
The first thing we need to do is get the public key into a CryptoKey object. After all this build-up, it turns out to be quite easy.
var publicKey;
var alg = certificate.tbsCertificate.signature.algorithm;
if (alg !== "1.2.840.113549.1.1.5" && alg !== "1.2.840.113549.1.1.11") {
    throw new Error("Signature algorithm " + alg + " is not supported yet.");
}
var hashName = "SHA-1";
if (alg === "1.2.840.113549.1.1.11") {
    hashName = "SHA-256";
}
window.crypto.subtle.importKey(
    'spki',
    certificate.tbsCertificate.subjectPublicKeyInfo.asn1.raw,
    {name: "RSASSA-PKCS1-v1_5", hash: {name: hashName}},
    true,
    ["verify"]
).then(function(key) {
    publicKey = key;
}).catch(function(err) {
    alert("Import failed: " + err.message);
});
Run this in the JavaScript console and then examine publicKey. It should show as a CryptoKey.
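The parsed certificate actually holds the signature algorithm twice: once inside the TBSCertificate and once next to the signature value. RFC 5280 requires the two to be the same, so a stricter version of the algorithm check could compare them before choosing a hash. This is only a sketch: signatureHashName and HASH_NAMES are hypothetical names, and it compares the OIDs only, not the parameters:

```javascript
// OIDs this post supports, mapped to Web Cryptography hash names
var HASH_NAMES = {
    "1.2.840.113549.1.1.5": "SHA-1",
    "1.2.840.113549.1.1.11": "SHA-256"
};

// Hypothetical helper: pick the hash name for a parsed certificate,
// refusing certificates whose two algorithm identifiers disagree.
function signatureHashName(certificate) {
    var inner = certificate.tbsCertificate.signature.algorithm;
    var outer = certificate.signatureAlgorithm.algorithm;
    if (inner !== outer) {
        throw new Error("Signature algorithms " + inner + " and " + outer + " do not match.");
    }
    if (!Object.prototype.hasOwnProperty.call(HASH_NAMES, inner)) {
        throw new Error("Signature algorithm " + inner + " is not supported yet.");
    }
    return HASH_NAMES[inner];
}
```

Called as signatureHashName(certificate), it returns the same value the code above stores in hashName, or throws if the certificate is inconsistent or uses an algorithm not handled here.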
Verifying the signature is just as easy:
window.crypto.subtle.verify(
    {name: "RSASSA-PKCS1-v1_5", hash: {name: hashName}},
    publicKey,
    certificate.signatureValue.bits.bytes,
    certificate.tbsCertificate.asn1.raw
).then(function(verified) {
    if (verified) {
        alert("The certificate is properly self-signed.");
    } else {
        alert("The self-signed certificate's signature is not valid.");
    }
}).catch(function(err) {
    alert("Error verifying signature: " + err.message);
});
And that is that.
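Outside the console, where the two steps are not typed in one at a time, they can be chained so that verification only starts once the key import has finished. Here is a minimal sketch of that, using the same calls as above. The name verifySelfSigned is hypothetical, and the SubtleCrypto object is passed in as a parameter purely as a design choice for this sketch:

```javascript
// Hypothetical helper: import the certificate's own public key and use it
// to check the certificate's signature. Returns a promise that resolves
// to true or false. "subtle" is a SubtleCrypto object.
function verifySelfSigned(certificate, hashName, subtle) {
    var algorithm = {name: "RSASSA-PKCS1-v1_5", hash: {name: hashName}};
    return subtle.importKey(
        "spki",
        certificate.tbsCertificate.subjectPublicKeyInfo.asn1.raw,
        algorithm,
        true,
        ["verify"]
    ).then(function (key) {
        // Only reached once the key has been imported
        return subtle.verify(
            algorithm,
            key,
            certificate.signatureValue.bits.bytes,
            certificate.tbsCertificate.asn1.raw
        );
    });
}

// Example use in the browser, with the values built earlier in this post:
// verifySelfSigned(certificate, hashName, window.crypto.subtle)
//     .then(function (verified) { console.log(verified); });
```

Returning the promise leaves it up to the caller to decide how to report the result or handle a rejection.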