Wednesday, June 10, 2020

Notes and picks from "Range: Why Generalists Triumph in a Specialized World"

Notes and picks from the book Range: Why Generalists Triumph in a Specialized World by David Epstein.

Specialized learning vs broad/wide learning

Epstein argues against giving the 10,000-Hour Rule too much weight. He starts by comparing Tiger Woods (who practiced golf heavily from early childhood) and Roger Federer (who dabbled in many different sports in his youth).

Epstein makes a distinction between "kind" and "wicked" environments (reminds me of the Cynefin framework):

  • "Kind" environments are environments with clear rules, cause-and-effect etc.
    • E.g. chess, firefighters, playing violin
    • In fields like this, 10,000-Hour rule is more relevant.
    • Studying often relates to patterns & repetitive structures
  • "Wicked" environments don't have such clear rules
    • Wider learning needed, very narrow expertise might even hurt the outcome

Too narrow knowledge

Epstein states that people study, and are taught, deep but separate branches of knowledge without getting the big picture.

Research of James Flynn is discussed (e.g. Flynn effect, increase in IQ test scores over the 20th century). According to Epstein, Flynn states that universities are teaching too much narrow specialization instead of giving breadth and critical thinking.

“Even the best universities aren’t developing critical intelligence,” he said. “They aren’t giving students the tools to analyze the modern world, except in their area of specialization. Their education is too narrow.”

Scientific education does not automatically make us more critical or open-minded: Yale law & psychology professor Dan Kahan has shown that more scientifically literate people are more likely to become dogmatic on politically polarized scientific topics, see e.g. the column Why we are poles apart on climate change.

Ospedale della Pietà is also discussed

  • A convent, orphanage and music school in Venice.
  • In the 1600s & 1700s it was famous for its all-female musical ensembles.
  • Epstein states that the students were learning many different instruments in their youth instead of focusing early on one instrument.

Analogies, potentially from distant domains, can be valuable when solving difficult problems.

Daniel Kahneman's Curriculum project was also referred to - beware the "inside view".

Slow learning preferred

Epstein states that learning should not be fast. Struggling to retrieve information improves learning and moves knowledge to long-term memory. Learning is improved by spacing, testing and making connections.

If you want it to stick, learning should be slow and hard, not quick and easy. The professors who received positive feedback had a net negative effect on their students in the long run. In contrast, those professors who received worse feedback actually inspired better student performance later on.

Focused "head start" or "early sampling"

Epstein discusses study & career paths - whether one should "be gritty with their chosen path" or change paths on finding out that the selected path is not optimal.

  • Epstein uses the concept of "match quality" - vision of the ideal career
  • "Winners quit fast and often" instead of "Quitters never win"
  • Knowing when to quit is important (though perseverance in difficult times is also important)
  • One's personality is not fixed
    • Personality changes over time, especially between 18 & late 20's -> an early guess might result in low match quality.
    • Also, personality varies by context - Instead of asking who's gritty and who is not, ask who is gritty in which situation.

Some related quotes

We find who we are by living.

We discover (our) possibilities by doing, trying out new activities, building new networks, finding new role models.

An early sampling period is better than a focused head start.

Foxes, Birds, Hedgehogs and Frogs

There are parables related to deep vs broad knowledge and experience. These two were presented:

Foxes vs Hedgehogs

  • E.g. Essay by Isaiah Berlin
  • Title is attributed to a quote from the Ancient Greek poet Archilochus: "a fox knows many things, but a hedgehog one important thing"
  • "Hedgehogs" would be people who view the world through the lens of a single defining idea
  • "Foxes" would be people drawing on a wide variety of experiences and for whom the world cannot be boiled down to a single idea

Birds vs frogs

Comes from a Freeman Dyson essay. Deep vs broad thinking - both are needed

Birds fly high in the air and survey broad vistas of mathematics out to the far horizon. They delight in concepts that unify our thinking and bring together diverse problems from different parts of the landscape.

Frogs live in the mud below and see only the flowers that grow nearby. They delight in the details of particular objects, and they solve problems one at a time.

On decision-making and communication

The Carter Racing case study is discussed (related to the NASA Challenger launch disaster and the decisions made there)

  • We don’t do a good job of asking “whether the data currently shown is all the data we need for making a decision or is there more data”
  • Reminds me of Kahneman's concept What You See Is All There Is

"Chain of command" and "Chain of communication" should be differentiated (information should flow in many directions).

Value of wide knowledge / range

Epstein gives an example of coronary stents & cardiologists: high specialization in one area causes one to see that one thing as “the one” for any case (seeing only a couple of pieces of a huge jigsaw puzzle).

Quote from Oliver Smithies:

Take your skills to a place that's not doing the same sort of thing. Take your skills and apply them to a new problem, or take your problem and try completely new skills.

New knowledge combinations

To recap: work that builds bridges between disparate pieces of knowledge is less likely to be funded, less likely to appear in famous journals, more likely to be ignored upon publication, and then more likely in the long run to be a smash hit in the library of human knowledge.

Related articles:

Advice for anyone: It's important to read “something outside your field”.

Final quotes

Compare yourself to yourself yesterday, not to younger people who aren’t you... you probably don’t even know where exactly you’re going, so feeling behind doesn’t help

 

So, about that, one sentence of advice: Don’t feel behind... research in myriad areas suggests that mental meandering and personal experimentation are sources of power, and head starts are overrated.

Thursday, June 4, 2020

JSON Web Tokens (JWT)

This time I read the JWT Handbook by Sebastián Peyrott, which is available from Auth0 in exchange for your email address.

JWT in general

What is JWT?

  • JWT stands for JSON Web Token.
  • A standard for safely passing claims in space constrained environments
  • Aims to be a simple, useful, standard container format that can optionally be also validated and/or encrypted.
  • Latest JWT spec: RFC 7519
  • Related specs

Example JWT

Example JWT from jwt.io (newlines added for readability):

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.
eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.
SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

This JWT has three parts separated with a dot:

  • Header JSON (encoded with Base64Url)
{
  "alg": "HS256",
  "typ": "JWT"
}
  • Payload (also encoded with Base64Url)
{
  "sub": "1234567890",
  "name": "John Doe",
  "iat": 1516239022
}
  • Signature built on header, payload & a secret
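
As a minimal sketch, the header and payload of the example token above can be decoded with nothing but the Python standard library. The helper name b64url_decode is my own (not from the book), and no signature check is done here; the point is only that the first two parts are readable by anyone.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a Base64Url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


# The example token from jwt.io shown above.
token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ."
    "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)

header_b64, payload_b64, signature_b64 = token.split(".")
header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))
signature = b64url_decode(signature_b64)  # raw bytes, not JSON

print(header)
print(payload)
print(len(signature), "signature bytes")
```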

Typical applications

client-side/stateless sessions

  • for storing client-side data
  • signature is typically used to validate the data
  • Can be potentially also encrypted

Security considerations

  • Signature Stripping
    • Removing the signature and changing the header to claim that the JWT is unsigned
    • -> Validation should not consider unsigned JWTs valid.
  • Cross-Site Request Forgery (CSRF)
    • Tries to make the user's browser perform requests, triggered from a different site, against a site where the user is logged in.
    • Relevant when session/JWT is in a cookie as cookies are sent by browser.
  • Cross-Site Scripting (XSS)
    • Attempt to inject JavaScript in trusted sites

Federated identity

OAuth 2.0 Access token & Refresh token as an example:

  • Access Token
    • Gives access to protected resources
    • Usually short-lived
    • Typically carries a signature (as signed JWT) -> can be validated by the resource servers
  • Refresh Token
    • Allows user to request new access tokens
    • Usually long-lived
    • Requires access to the authorization server
  • OAuth 2.0 does not specify the format of tokens.
    • JWTs are a good match for access tokens.
    • OpenID Connect uses JWT to represent the ID token

JSON Web Token in Detail

  • Three elements: header, payload and signature/encryption data
  • Header & payload are JSON objects
  • Signature/encryption part depends on the algorithm used for signing or encryption. (In the case of unencrypted JWT it is omitted)
  • Compact serialization: Base64 URL-safe encoding of UTF-8 bytes of header & payload (JSON) and signing/encryption data (not JSON)

Header

  • Also known as the JOSE header (JSON Object Signing and Encryption)
  • Carries claims about the JWT itself

Claims:

  • alg (Algorithm)
    • Algorithm used for signing and/or encrypting the JWT
    • Only mandatory claim for an unencrypted JWT
  • cty (Content Type)
    • In the typical case of specific claims and arbitrary data, this must not be set.
    • Must be "JWT" when payload is another JWT itself (nested JWT)
  • typ (Media type)
    • relevant only in cases when JWTs could be mixed with other objects carrying a JOSE header (which rarely happens)

Payload

  • No mandatory claims
  • Registered claims have specific meaning

Registered Claims:

  • iss (Issuer)
    • A case-sensitive string or URI uniquely identifying the JWT issuer
    • Application-specific interpretation
  • sub (Subject)
    • A case-sensitive string or URI
    • Identifying the party that this JWT carries information about.
    • JWT claims are about this party.
    • Application-specific handling
  • aud (Audience)
    • Either a single case-sensitive string or URI or an array of such values
    • Identifying intended recipients
    • Application-specific interpretation
  • exp (Expiration (time))
  • nbf (Not before (time))
    • "Opposite of exp claim"
  • iat (Issued At (time))
  • jti (JWT ID)
    • Can be used to differentiate a JWT from other, similar JWTs
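
To make the time- and audience-related claims concrete, here is a rough sketch of how a recipient might check exp, nbf and aud after the signature has already been verified. The function name, the leeway parameter and the example values (issuer URL, audience names) are my own assumptions for illustration, not something taken from the book.

```python
import time


def validate_registered_claims(payload, expected_audience, now=None, leeway=0):
    """Return a list of problems found in exp, nbf and aud (empty list = OK)."""
    now = time.time() if now is None else now
    problems = []
    # exp: the token must not be used at or after this time.
    if "exp" in payload and now >= payload["exp"] + leeway:
        problems.append("token has expired (exp)")
    # nbf: the token must not be used before this time.
    if "nbf" in payload and now < payload["nbf"] - leeway:
        problems.append("token is not valid yet (nbf)")
    # aud: a single string or an array of strings.
    if "aud" in payload:
        aud = payload["aud"]
        audiences = [aud] if isinstance(aud, str) else list(aud)
        if expected_audience not in audiences:
            problems.append("token is meant for another audience (aud)")
    return problems


# Hypothetical claims; times are plain numbers to keep the example readable.
claims = {"iss": "https://issuer.example", "sub": "1234567890",
          "aud": ["api-a", "api-b"], "nbf": 1000, "exp": 2000}

print(validate_registered_claims(claims, "api-a", now=1500))
print(validate_registered_claims(claims, "api-c", now=2500))
```

All registered claims are optional, so the sketch only checks the ones that are present; an application may well decide that a missing exp is itself an error.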

Other claims are private or public

JSON Web Signatures (JWS)

  • The book describes JWS as "probably the single most useful feature of JWTs"
  • Allows establishing the authenticity of the JWT (validation)
  • Note: Does not prevent other parties from reading the contents inside the JWT

Algorithms

Specified in RFC 7518, JSON Web Algorithms (JWA)

Keyed-Hash Message Authentication Code (HMAC) is an algorithm that produces a code (hash) from a certain payload with a secret using a cryptographic hash function.

One algorithm is required to be supported by all JWS conforming implementations:

  • "HS256"
    • HMAC using SHA-256 hash function (shared secret scheme)
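
A minimal sketch of what "HS256" amounts to in practice, using only the Python standard library: the signature is an HMAC-SHA256 over the Base64Url-encoded header and payload joined with a dot. The function names and the secret are hypothetical; a real application would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    """Base64Url without the trailing '=' padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(header: dict, payload: dict, secret: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA256."""
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    mac = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(mac)


def verify_hs256(token: str, secret: bytes) -> bool:
    """Recompute the HMAC and compare it in constant time."""
    signing_input, _, signature_b64 = token.rpartition(".")
    mac = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url_encode(mac).encode("ascii"),
                               signature_b64.encode("utf-8"))


secret = b"a-long-random-shared-secret"  # hypothetical value
token = sign_hs256({"alg": "HS256", "typ": "JWT"}, {"sub": "1234567890"}, secret)
print(token)
print(verify_hs256(token, secret))
```

Because the same secret both creates and verifies the signature, every party that can verify a token can also forge one; that is the main trade-off of the shared secret scheme.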

These are recommended:

  • "RS256"
    • RSASSA PKCS1 v1.5 using SHA-256
    • RSASSA is a variation of asymmetric RSA algorithm used for signatures.
      • Private key can be used to create a signature (and to verify it)
      • Public key can only be used to verify the signature (and thus authenticity of the message)
  • "ES256"
    • ECDSA using P-256 and SHA-256
    • Uses an alternative to RSA, Elliptic Curve Digital Signature Algorithm (ECDSA)
    • Also an algorithm with public and private keys but different mathematics.

Optional ones (that are in practice variations of required and recommended ones):

  • "HS384" & "HS512": Variations of "HS256" with SHA-384 and SHA-512
  • "RS384" & "RS512": Variations of "RS256" with SHA-384 and SHA-512
  • "ES384" & "ES512": Variations of "ES256" with SHA-384 and SHA-512.
  • "PS256", "PS384" & "PS512": RSASSA-PSS + MGF1 with SHA-256/SHA-384/SHA-512

JWS Header Claims

See section 4.1. of RFC 7515

Serializations

JWS spec defines two types of serialization:

  • JWS Compact Serialization
    • The typical JWT serialization
    • Base64url-encoded header, payload and signature separated with dots
    • Single signature
  • JWS JSON Serialization, with two alternatives
    • General syntax that supports multiple signatures
    • Flattened syntax (a single signature)
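
To see how the two serializations relate, here is a small sketch that rewrites a compact JWS into the flattened JSON syntax. The member names "protected", "payload" and "signature" come from RFC 7515; the function name is my own, and the sketch does not verify anything.

```python
import json


def compact_to_flattened(compact: str) -> str:
    """Re-express a compact JWS (three dot-separated parts) as flattened JSON."""
    protected, payload, signature = compact.split(".")
    return json.dumps(
        {"payload": payload, "protected": protected, "signature": signature},
        indent=2,
    )


compact = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ."
    "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)
flattened = compact_to_flattened(compact)
print(flattened)
```

In the general syntax the "protected"/"signature" pair would instead sit inside a "signatures" array, one entry per signature.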

For more details, see section 7 of RFC 7515.

JSON Web Encryption (JWE)

While JWS makes it possible to validate data, JWE makes it possible to prevent third parties from reading the data.

As in JWS, two schemes:

  • a shared secret scheme - A party that holds the shared secret can encrypt and decrypt data
  • a public/private key scheme
    • A party that holds the public key can encrypt data.
    • A party that holds the private key can decrypt data.
    • NOTE: Anyone holding the public key can encrypt new data
      • Thus JWE does not replace role of JWS in token exchange
      • JWE and JWS are complementary when using a public/private key scheme.
    • Encrypted JWTs are sometimes nested: An encrypted JWT serves as a container for a signed JWT

Structure of an encrypted JWT

Encrypted JWT compact representation has 5 elements (instead of 3 in signed and unsecured JWTs)

  1. The protected header - As JWS header
  2. The encrypted key - A symmetric key used to encrypt the ciphertext & other encrypted data
    • Note that the ciphertext is encrypted in a symmetric way even if an asymmetric algorithm is used to encrypt the key.
  3. The initialization vector - Needed for some encryption algorithms
  4. The encrypted data (ciphertext)
  5. The authentication tag - Can be used to validate the ciphertext
    • Note that this doesn't remove the need for nested JWTs
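
As a quick illustration of the three-part versus five-part layout, the sketch below only splits a compact token and names its segments; it does not decrypt or verify anything. The function name, the part names and the placeholder token are my own and purely illustrative.

```python
JWS_PARTS = ("header", "payload", "signature")
JWE_PARTS = ("protected_header", "encrypted_key", "initialization_vector",
             "ciphertext", "authentication_tag")


def split_compact(token: str) -> dict:
    """Name the dot-separated segments of a compact JWS (3) or JWE (5)."""
    segments = token.split(".")
    if len(segments) == len(JWS_PARTS):
        names = JWS_PARTS
    elif len(segments) == len(JWE_PARTS):
        names = JWE_PARTS
    else:
        raise ValueError("expected 3 (JWS) or 5 (JWE) segments, got %d" % len(segments))
    return dict(zip(names, segments))


# Made-up placeholder segments, not a real encrypted token.
fake_jwe = "aGVhZGVy.a2V5.aXY.Y2lwaGVydGV4dA.dGFn"
for name, value in split_compact(fake_jwe).items():
    print(name, "=", value)
```

A segment can be empty: with Direct Encryption, for instance, there is no encrypted key to carry, so two dots appear next to each other.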

Key Encryption Algorithms

Key Encryption Algorithms ("alg" header) recommended to be implemented:

  • RSA variants:
    • "RSA1_5" - RSAES-PKCS1-v1_5 (NOTE: marked for removal of the recommendation)
    • "RSA-OAEP" - RSAES-OAEP with defaults (marked to be required in the future)
  • AES variants
    • "A128KW" - AES-128 Key Wrap
    • "A256KW" - AES-256 Key Wrap
  • Elliptic Curve variants:
    • "ECDH-ES" - Elliptic Curve Diffie-Hellman Ephemeral Static (ECDH-ES) using Concat KDF (marked to be required in the future)
  • Combinations
    • "ECDH-ES+A128KW" - ECDH-ES using Concat KDF and CEK wrapped with AES-128
    • "ECDH-ES+A256KW" - ECDH-ES using Concat KDF and CEK wrapped with AES-256

Key Management Modes

JWE spec defines a couple of different Key Management Modes related to determining the Content Encryption Key (CEK)

  • Key Encryption - CEK is encrypted for the intended recipient using an asymmetric encryption algorithm
  • Key Wrapping - CEK is encrypted for the intended recipient using a symmetric encryption algorithm
  • Direct Key Agreement - a key agreement algorithm is used to agree upon the CEK value.
  • Key Agreement with Key Wrapping - a key agreement algorithm is used to agree upon a symmetric key used to encrypt the CEK value to the intended recipient using a symmetric key wrapping algorithm.
  • Direct Encryption - shared symmetric key is used as the CEK

It's important to note that CEK and JWE encryption key are different things

  • CEK is the key used to encrypt/decrypt the actual data payload
  • JWE encryption key is used to encrypt or compute the CEK (unless Direct Encryption is used)

Required Content Encryption Algorithms ("enc" header):

  • AES CBC + HMAC SHA - AES 128/256 with Cipher Block Chaining and HMAC + SHA-256/512 for validation.
    • "A128CBC-HS256" - AES_128_CBC_HMAC_SHA_256
    • "A256CBC-HS512" - AES_256_CBC_HMAC_SHA_512

JWE Header Claims

See section 4.1. of RFC 7516

JSON Web Keys (JWK)

  • Different representations for the keys used for signatures and encryption
  • Aiming for a unified presentation of all keys supported in the JWA spec.

An example JWK from RFC 7517:

{
  "kty": "EC",    // Key type: Elliptic Curve
  "crv": "P-256", // Curve type: P-256
  "x": "f83OJ3D2xF1Bg8vub9tLe1gHMzV76e8Tus9uPHvRVEU", // base64-encoded x & y coordinates
  "y": "x_FEzRu9m36HLN_tue659LNpXW6pCyStikYjKIWI5a0", // (Parameters for elliptic curves)
  "kid": "Public key used in JWS spec Appendix A.3 example" // Key identifier
}

Common parameters (more details in section 4 of RFC 7517):

  • kty (Key Type) - "EC" / "RSA" / "oct" (symmetric keys)
  • use (Public Key Use) - "sig" (signature) / "enc" (encryption)
  • key_ops (Key Operations)
    • an array of strings specifying detailed uses for the key
    • Potential values "sign", "verify", "encrypt", "decrypt", "wrapKey", "unwrapKey", "deriveKey", "deriveBits"
  • alg (Algorithm) - the algorithm intended for use with the key
  • kid (Key ID) - A unique identifier for this key.
  • x5u (X.509 URL) - A URL pointing to a X.509 public key certificate or certificate chain in PEM encoded form
  • x5c (X.509 Certificate Chain) - Base64-encoded (standard Base64, not the URL-safe variant) X.509 DER public key certificate or certificate chain
  • x5t (X.509 Certificate SHA-1 Thumbprint) - Base64url-encoded SHA-1 thumbprint/fingerprint of the DER encoding of an X.509 certificate
  • x5t#S256 (X.509 Certificate SHA-256 Thumbprint) - As x5t, but with SHA-256 thumbprint/fingerprint.
  • Other parameters specific to the key algorithm. e.g. x, y, d, n, e etc.

JSON Web Key Sets (aka JWK Sets)

  • Carry more than one key
  • Meaning of the order of the keys is user-defined
  • A JSON object with "keys" field consisting of a JSON array of JWKs
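
A typical use of a JWK Set is picking the right key for a token based on its "kid" header. The sketch below shows one possible lookup; the function name and the "kid" values are hypothetical, the EC coordinates are copied from the RFC 7517 example above, and the symmetric key value is a dummy.

```python
jwk_set = {
    "keys": [
        {
            "kty": "EC", "crv": "P-256", "use": "sig",
            "kid": "ec-key-1",  # hypothetical key id
            "x": "f83OJ3D2xF1Bg8vub9tLe1gHMzV76e8Tus9uPHvRVEU",
            "y": "x_FEzRu9m36HLN_tue659LNpXW6pCyStikYjKIWI5a0",
        },
        {
            "kty": "oct", "use": "enc",
            "kid": "shared-secret-1",  # hypothetical key id
            "k": "c2VjcmV0",           # dummy symmetric key material
        },
    ]
}


def find_key(jwk_set: dict, kid: str, use=None):
    """Return the first JWK with a matching kid (and use, if given), else None."""
    for key in jwk_set.get("keys", []):
        if key.get("kid") == kid and (use is None or key.get("use") == use):
            return key
    return None


print(find_key(jwk_set, "ec-key-1", use="sig"))
print(find_key(jwk_set, "ec-key-1", use="enc"))
```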

JSON Web Algorithms

In this chapter, the algorithms used in earlier chapters are discussed in more detail.

Base64

Base64 is a binary-to-text encoding used widely with JWT, JWS and JWE. With JWT & related specs, a URL-safe variant of Base64 is used. For more details, see e.g. RFC 4648.
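
A small sketch of the difference between standard Base64 and the URL-safe variant as it is used in JWTs (with the trailing "=" padding removed). The helper names are my own; the standard library does the actual encoding.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 ('-' and '_' instead of '+' and '/'), padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


data = bytes([0xFB, 0xFF, 0xFE])
print(base64.b64encode(data).decode("ascii"))  # standard alphabet: +//+
print(b64url_encode(data))                     # URL-safe alphabet: -__-
print(b64url_encode(b"a"))                     # YQ (standard Base64 would give YQ==)
```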

Secure Hash Algorithm (SHA)

  • SHA used in JWT is defined in FIPS-180, see also RFC 4634.
  • Note: Not to be confused with SHA-1 (deprecated, should not be used)
    • FIPS-180 SHA is sometimes called SHA-2
  • For JWT, SHA-256 & SHA-512 are of interest.
  • Roughly:
    • Input is processed in fixed-size chunks
    • For each chunk, perform a bunch of mathematical operations
    • Result is accumulated with previous chunk results
    • After all chunks, digest is computed.
  • For code example, see sha256.js
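
In Python the hash functions are available through hashlib. The sketch below feeds the same message to SHA-256 once in one go and once in 64-byte pieces; the results match because the digest accumulates state as input arrives. Note that update() accepts pieces of any size, and the 64 here merely mirrors the block size SHA-256 works with internally.

```python
import hashlib

message = b"abc" * 1000

# One-shot hashing.
one_shot = hashlib.sha256(message).hexdigest()

# Incremental hashing: state is carried over from piece to piece.
hasher = hashlib.sha256()
for start in range(0, len(message), 64):
    hasher.update(message[start:start + 64])
chunked = hasher.hexdigest()

print(one_shot == chunked)
print(hashlib.sha256().digest_size * 8, "bit digest for SHA-256")
print(hashlib.sha512().digest_size * 8, "bit digest for SHA-512")
```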

Hash-based Message Authentication Code (HMAC)

  • Use a cryptographic hash function (e.g. SHA family) and a key to create an authentication code.
  • Takes a hash function, a message and a secret key as input
  • Produces an authentication code (HMAC) as output

Definition from RFC 2104:

To compute HMAC over the data `text' we perform

H(K XOR opad, H(K XOR ipad, text))

  • ipad = the byte 0x36 repeated B times
  • opad = the byte 0x5C repeated B times

So, e.g. "HS256" (HMAC + SHA256) means HMAC using SHA-256 as the hash function.
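
The RFC 2104 definition is short enough to write out by hand. The sketch below is for understanding only (in real code the standard hmac module should be used) and compares its result with that module. Here B is 64, the block size of SHA-256 in bytes, and keys are zero-padded to B bytes (or hashed first if longer), as the RFC describes.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, text: bytes) -> bytes:
    """H(K XOR opad, H(K XOR ipad, text)) with H = SHA-256."""
    block_size = 64  # B: SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(block_size, b"\x00")    # short keys are zero-padded to B bytes
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + text).digest()
    return hashlib.sha256(opad + inner).digest()


key, text = b"secret", b"header.payload"
print(hmac_sha256(key, text).hex())
print(hmac_sha256(key, text) == hmac.new(key, text, hashlib.sha256).digest())
```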

RSA

  • "RSA" stands for the initials of its developers Ron Rivest, Adi Shamir and Leonard Adleman.
  • Has variations both for signing and encryption
  • Relies on integer factorization being a computationally expensive operation to perform.

The RSA "basic expression": (m^e)^d = m (mod n) where

  • It is computationally feasible to find very large integers e, d and n that satisfy the equation.
  • It is relatively difficult to find d when other numbers are known.
  • Public key is composed of values n and e
  • Private key is composed of values n and d
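
The basic expression can be tried out with tiny numbers. The primes below are a common textbook choice and are far too small to be secure; this is only meant to show the roles of n, e and d. It assumes Python 3.8+ for the modular inverse via pow.

```python
# Toy RSA key (insecure, for illustration only).
p, q = 61, 53
n = p * q                 # 3233, part of both keys
phi = (p - 1) * (q - 1)   # 3120, easy to compute only if p and q are known
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent: e * d = 1 (mod phi)

m = 65                    # a "message" smaller than n
c = pow(m, e, n)          # apply the public key
recovered = pow(c, d, n)  # apply the private key

print("public key:", (n, e))
print("private key:", (n, d))
print(recovered == m)
```

Finding d took one line here only because p and q are known; an attacker who sees just n and e would have to factor n first, which is what becomes infeasible for very large n.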

More details can be found from e.g. The Public Key Cryptography Standard #1 (PKCS #1) (RFC 3447).

Signing with RSA

Signing:

  • Produce a message digest from the message
  • Raise digest to the power of d mod n
  • Attach the result as signature

Verifying signature:

  • Raise signature to the power of e mod n
  • Produce a message digest from the message
  • If the results from previous steps match, the signature is valid
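
The signing and verification steps above can be sketched as follows. This is "textbook" RSA on a toy key: the digest is simply reduced modulo n, whereas real RS256 pads the digest according to PKCS#1 v1.5 and uses keys of 2048 bits or more. The key below, built from two small Mersenne primes, and all function names are my own assumptions for illustration.

```python
import hashlib

# Toy key (insecure): two Mersenne primes, about 150 bits in total.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # requires Python 3.8+


def digest_as_int(message: bytes) -> int:
    """SHA-256 digest as an integer, reduced mod n (no PKCS#1 padding)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Raise the digest to the power of d mod n (private key)."""
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Raise the signature to the power of e mod n and compare with the digest."""
    return pow(signature, e, n) == digest_as_int(message)


signature = sign(b"header.payload")
print(verify(b"header.payload", signature))
print(verify(b"header.tampered", signature))
```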

JWT "RS256" signature algorithm is PKCS#1 RSASSA v1.5 using SHA-256.

Elliptic Curve (EC)

Elliptic curves are a different field of mathematics that provides a "one-way function" that is easy to compute but hard to invert (the elliptic curve discrete logarithm problem).

Some math resources:

Elliptic Curve Digital Signature Algorithm (ECDSA)

  • Curves and algorithms defined in FIPS 186-4 + other associated standards.
    • JWA uses three curves: P-256, P-384, and P-521.
  • Given a certain curve and a "base point" G on it used for EC operations:
    • Private key can be constructed by picking a random number between 1 and n (order of base point G)
    • Public key can be computed by multiplying the base point G by the private key
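
To get a feel for the key construction, here is a sketch on a tiny textbook curve, y^2 = x^3 + 2x + 2 over the integers modulo 17, with base point G = (5, 1) of order 19. This is not P-256 and offers no security; the curve is chosen only so the numbers stay readable. "Multiplying" G by the private key means adding G to itself that many times, done here with double-and-add. Requires Python 3.8+.

```python
import secrets

P_MOD, A, B = 17, 2, 2   # toy curve: y^2 = x^3 + 2x + 2 (mod 17)
G = (5, 1)               # base point
N = 19                   # order of G


def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # a point plus its negative
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


private_key = secrets.randbelow(N - 1) + 1   # random integer in 1..N-1
public_key = scalar_mult(private_key, G)     # a point on the curve
print("private:", private_key, "public:", public_key)
```

Going from the public key back to the private key means solving the discrete logarithm problem on the curve; trivial with 19 points, infeasible on a curve like P-256.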

"ES256" is ECDSA using elliptic curve P-256 and SHA-256 hash.

Best practices

Based on RFC 8725.

Common pitfalls / attacks

  • alg: none
    • setting header "alg" to "none" and modifying payload
  • Using RS256 Public-key as HS256 secret
    • as public key is often public
  • Weak HMAC keys
    • If using a HMAC key of "typical password length", brute force attack might be possible
  • Wrong stacked encryption + signature verification assumptions
    • Wrong assumption that encryption would provide also protection against tampering
    • Esp. non-standard encryption algorithms might not have data integrity verification
    • Nested JWTs: Failing to validate innermost JWT when encrypted JWT is carrying a signed JWT
  • Invalid Elliptic curve attacks
  • Substitution attacks
    • Sending a token intended for recipient A to recipient B (if both verify the token with the same public key)
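
The first two pitfalls share a root cause: the verifier lets the token's own header decide how it is verified. A rough sketch of the alternative, under the assumption that the application expects exactly one algorithm (HS256 here), is to pin that algorithm in code and reject everything else. All names and the secret are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(header: dict, payload: dict, signature: bytes = b"") -> str:
    """Assemble a compact token from its parts (helper for this example)."""
    parts = [json.dumps(part, separators=(",", ":")).encode("utf-8")
             for part in (header, payload)]
    return ".".join(b64url_encode(part) for part in parts + [signature])


def verify_pinned_hs256(token: str, secret: bytes) -> dict:
    """Accept only HS256; never trust the header to pick the algorithm."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWS")
    header = json.loads(b64url_decode(parts[0]))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected alg: %r" % header.get("alg"))
    signing_input = (parts[0] + "." + parts[1]).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(parts[2])):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(parts[1]))


secret = b"a-long-random-shared-secret"  # hypothetical value
forged = make_token({"alg": "none"}, {"sub": "admin"})
try:
    verify_pinned_hs256(forged, secret)
except ValueError as error:
    print("rejected:", error)
```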

Mitigations