Wednesday, June 10, 2020

Notes and picks from "Range: Why Generalists Triumph in a Specialized World"

Notes and picks from the book Range: Why Generalists Triumph in a Specialized World by David Epstein.

Specialized learning vs broad/wide learning

Epstein argues against giving the 10,000-Hour Rule too much weight. He starts by comparing Tiger Woods (who practiced golf heavily from early childhood) and Roger Federer (who dabbled in many different sports in his youth).

Epstein makes a distinction between "kind" and "wicked" environments (reminds me of the Cynefin framework):

  • "Kind" environments are environments with clear rules, cause-and-effect etc.
    • E.g. chess, firefighters, playing violin
    • In fields like this, 10,000-Hour rule is more relevant.
    • Studying often relates to patterns & repetitive structures
  • "Wicked" environments don't have such clear rules
    • Wider learning needed, very narrow expertise might even hurt the outcome

Too narrow knowledge

Epstein states that people study and are taught deep but separate branches of knowledge without getting the big picture.

James Flynn's research is discussed (e.g. the Flynn effect, the increase in IQ test scores over the 20th century). According to Epstein, Flynn states that universities teach too much narrow specialization instead of breadth and critical thinking:

“Even the best universities aren’t developing critical intelligence,” he said. “They aren’t giving students the tools to analyze the modern world, except in their area of specialization. Their education is too narrow.”

Scientific education does not automatically make us more critical or open-minded: Yale law & psychology professor Dan Kahan has shown that more scientifically literate people are more likely to become dogmatic on politically polarized scientific topics, see e.g. the column "Why we are poles apart on climate change".

Ospedale della Pietà is also discussed

  • A convent, orphanage and music school in Venice.
  • In the 1600s & 1700s it was famous for its all-female musical ensembles.
  • Epstein states that the students learned many different instruments in their youth instead of focusing early on one instrument.

Analogies, potentially from distant domains, can be valuable when solving difficult problems.

Daniel Kahneman's curriculum project was also referred to: beware the "inside view".

Slow learning preferred

Epstein states that learning should not be fast. Struggling to retrieve information improves learning and moves knowledge into long-term memory. Learning is improved by spacing, testing and making connections.

If you want it to stick, learning should be slow and hard, not quick and easy. The professors who received positive feedback had a net negative effect on their students in the long run. In contrast, those professors who received worse feedback actually inspired better student performance later on.

Focused "head start" or "early sampling"

Epstein discusses study & career paths - whether one should "be gritty with their chosen path" or change path if finding out that selected path is not optimal.

  • Epstein uses the concept of "match quality": the degree of fit between the work and the person doing it
  • "Winners quit fast and often" instead of "Quitters never win"
  • Knowing when to quit is important (though perseverance in difficult times is also important)
  • One's personality is not fixed
    • Personality changes over time, especially between 18 and the late 20s, so an early guess might result in low match quality.
    • Also, personality varies by context - Instead of asking who's gritty and who is not, ask who is gritty in which situation.

Some related quotes

We find who we are by living.

We discover (our) possibilities by doing, trying out new activities, building new networks, finding new role models.

An early sampling period is better than a focused head start.

Foxes, Birds, Hedgehogs and Frogs

There are parables related to deep vs broad knowledge and experience. These two were presented:

Foxes vs Hedgehogs

  • E.g. the essay by Isaiah Berlin
  • The title is attributed to a quote from the Ancient Greek poet Archilochus: "a fox knows many things, but a hedgehog one important thing"
  • "Hedgehogs" would be people who view the world through the lens of a single defining idea
  • "Foxes" would be people drawing on a wide variety of experiences and for whom the world cannot be boiled down to a single idea

Birds vs frogs

Comes from a Freeman Dyson essay. Deep vs broad thinking: both are needed.

Birds fly high in the air and survey broad vistas of mathematics out to the far horizon. They delight in concepts that unify our thinking and bring together diverse problems from different parts of the landscape.

Frogs live in the mud below and see only the flowers that grow nearby. They delight in the details of particular objects, and they solve problems one at a time.

On decision-making and communication

The Carter Racing case study is discussed (related to the NASA Challenger launch disaster and the decisions made there)

  • We don’t do a good job of asking “is the data currently shown all the data we need for making a decision, or is there more?”
  • Reminds me of Kahneman's concept What You See Is All There Is

"Chain of command" and "Chain of communication" should be differentiated (information should flow in many directions).

Value of wide knowledge / range

Epstein gives an example of coronary stents and cardiologists: high specialization in one area leads one to see that one thing as “the one” for every case (seeing only a couple of pieces of a huge jigsaw puzzle).

Quote from Oliver Smithies:

Take your skills to a place that's not doing the same sort of thing. Take your skills and apply them to a new problem, or take your problem and try completely new skills.

New knowledge combinations

To recap: work that builds bridges between disparate pieces of knowledge is less likely to be funded, less likely to appear in famous journals, more likely to be ignored upon publication, and then more likely in the long run to be a smash hit in the library of human knowledge.

Related articles:

Advice for anyone: It's important to read “something outside your field”.

Final quotes

Compare yourself to yourself yesterday, not to younger people who aren’t you... you probably don’t even know where exactly you’re going, so feeling behind doesn’t help

 

So, about that, one sentence of advice: Don’t feel behind... research in myriad areas suggests that mental meandering and personal experimentation are sources of power, and head starts are overrated.

Thursday, June 4, 2020

JSON Web Tokens (JWT)

This time I read the JWT Handbook by Sebastián Peyrott, which is available from Auth0 in exchange for your email address.

JWT in general

What is JWT?

  • JWT stands for JSON Web Token.
  • A standard for safely passing claims in space-constrained environments
  • Aims to be a simple, useful, standard container format that can optionally also be validated and/or encrypted.
  • Latest JWT spec: RFC 7519
  • Related specs

Example JWT

Example JWT from jwt.io (newlines added for readability):

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.
eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.
SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

This JWT has three parts separated with a dot:

  • Header JSON (encoded with Base64Url)
{
  "alg": "HS256",
  "typ": "JWT"
}
  • Payload (also encoded with Base64Url)
{
  "sub": "1234567890",
  "name": "John Doe",
  "iat": 1516239022
}
  • Signature built on header, payload & a secret
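To see the three parts in practice, here is a minimal Java sketch that splits the example token above and decodes the header and payload. The class and method names (JwtDecodeSketch, decodePart) are made up for this note. It only decodes; it does not check the signature, so it says nothing about whether the token can be trusted.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class JwtDecodeSketch {

    // The example token from jwt.io shown above, joined back into one line.
    static final String TOKEN =
            "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
          + "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ."
          + "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c";

    /** Decodes one Base64Url-encoded part (0 = header, 1 = payload) into its JSON text. */
    static String decodePart(String token, int index) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected three dot-separated parts");
        }
        // The URL-safe decoder accepts input without '=' padding, as used in JWTs.
        byte[] json = Base64.getUrlDecoder().decode(parts[index]);
        return new String(json, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decodePart(TOKEN, 0)); // {"alg":"HS256","typ":"JWT"}
        System.out.println(decodePart(TOKEN, 1)); // {"sub":"1234567890","name":"John Doe","iat":1516239022}
    }
}
```

Anyone can run this kind of decoding on a signed token, which is why the payload of a signed (but not encrypted) JWT should not carry secrets.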

Typical applications

client-side/stateless sessions

  • for storing client-side data
  • signature is typically used to validate the data
  • Can potentially also be encrypted

Security considerations

  • Signature Stripping
    • Removing the signature and changing the header to claim that the JWT is unsigned
    • -> Validation should not consider unsigned JWTs valid.
  • Cross-Site Request Forgery (CSRF)
    • Tries to make the user's browser perform requests, from a different site, against a site where the user is logged in.
    • Relevant when the session/JWT is in a cookie, as cookies are sent automatically by the browser.
  • Cross-Site Scripting (XSS)
    • Attempt to inject JavaScript in trusted sites

Federated identity

OAuth 2.0 Access token & Refresh token as an example:

  • Access Token
    • Gives access to protected resources
    • Usually short-lived
    • Typically carries a signature (as signed JWT) -> can be validated by the resource servers
  • Refresh Token
    • Allows user to request new access tokens
    • Usually long-lived
    • Requires access to the authorization server
  • OAuth 2.0 does not specify the format of tokens.
    • JWTs are a good match for access tokens.
    • OpenID Connect uses JWT to represent the ID token

JSON Web Token in Detail

  • Three elements: header, payload and signature/encryption data
  • Header & payload are JSON objects
  • Signature/encryption part depends on the algorithm used for signing or encryption. (In the case of unencrypted JWT it is omitted)
  • Compact serialization: Base64 URL-safe encoding of UTF-8 bytes of header & payload (JSON) and signing/encryption data (not JSON)

Header

  • Also known as the JOSE header (JSON Object Signing and Encryption)
  • Claims about the JWT itself

Claims:

  • alg (Algorithm)
    • Algorithm used for signing and/or encrypting the JWT
    • Only mandatory claim for an unencrypted JWT
  • cty (Content Type)
    • In the typical case of specific claims and arbitrary data, this must not be set.
    • Must be "JWT" when payload is another JWT itself (nested JWT)
  • typ (Media type)
    • relevant only in cases when JWTs could be mixed with other objects carrying a JOSE header (which rarely happens)

Payload

  • No mandatory claims
  • Registered claims have specific meaning

Registered Claims:

  • iss (Issuer)
    • A case-sensitive string or URI uniquely identifying the JWT issuer
    • Application-specific interpretation
  • sub (Subject)
    • A case-sensitive string or URI
    • Identifying the party that this JWT carries information about.
    • JWT claims are about this party.
    • Application-specific handling
  • aud (Audience)
    • Either a single case-sensitive string or URI or an array of such values
    • Identifying intended recipients
    • Application-specific interpretation
  • exp (Expiration (time))
  • nbf (Not before (time))
    • "Opposite of exp claim"
  • iat (Issued At (time))
  • jti (JWT ID)
    • Can be used to tell a JWT apart from others

Other claims are private or public

JSON Web Signatures (JWS)

  • The book describes JWS as "probably the single most useful feature of JWTs"
  • Allows establishing the authenticity of the JWT (validation)
  • Note: Does not prevent other parties from reading the contents inside the JWT

Algorithms

Specified in RFC 7518, JSON Web Algorithms (JWA)

Keyed-Hash Message Authentication Code (HMAC) is an algorithm that produces a code (hash) from a certain payload with a secret using a cryptographic hash function.

One algorithm is required to be supported by all JWS conforming implementations:

  • "HS256"
    • HMAC using SHA-256 hash function (shared secret scheme)
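A minimal sketch of what "HS256" signing and verification could look like with the JDK's javax.crypto.Mac. The class name Hs256Sketch and the secret used in main are made up for illustration; a real verifier would also parse the header and insist that "alg" is exactly the algorithm it expects before trusting anything.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class Hs256Sketch {

    private static final Base64.Encoder B64URL = Base64.getUrlEncoder().withoutPadding();

    /** Base64Url-encoded HMAC-SHA256 over the signing input "header.payload". */
    static String signature(String signingInput, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return B64URL.encodeToString(mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a compact token from JSON header and payload strings. */
    static String sign(String headerJson, String payloadJson, byte[] secret) {
        String input = B64URL.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8))
                + "." + B64URL.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        return input + "." + signature(input, secret);
    }

    /** Recomputes the signature with the shared secret and compares in constant time. */
    static boolean verify(String token, byte[] secret) {
        int lastDot = token.lastIndexOf('.');
        if (lastDot < 0) {
            return false;
        }
        String input = token.substring(0, lastDot);
        byte[] expected = signature(input, secret).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = token.substring(lastDot + 1).getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(expected, actual);
    }

    public static void main(String[] args) {
        // Hypothetical secret; real HMAC keys should be long and random.
        byte[] secret = "an-example-secret-that-is-long-enough".getBytes(StandardCharsets.UTF_8);
        String token = sign("{\"alg\":\"HS256\",\"typ\":\"JWT\"}", "{\"sub\":\"1234567890\"}", secret);
        System.out.println(token);
        System.out.println(verify(token, secret));
    }
}
```

Because the same secret both creates and checks the signature, every party that can verify an HS256 token can also forge one.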

These are recommended:

  • "RS256"
    • RSASSA PKCS1 v1.5 using SHA-256
    • RSASSA is a variation of asymmetric RSA algorithm used for signatures.
      • Private key can be used to create a signature (and to verify it)
      • Public key can only be used to verify the signature (and thus authenticity of the message)
  • "ES256"
    • ECDSA using P-256 and SHA-256
    • Uses an alternative to RSA, Elliptic Curve Digital Signature Algorithm (ECDSA)
    • Also an algorithm with public and private keys but different mathematics.

Optional ones (that are in practice variations of required and recommended ones):

  • "HS384" & "HS512": Variations of "HS256" with SHA-384 and SHA-512
  • "RS384" & "RS512": Variations of "RS256" with SHA-384 and SHA-512
  • "ES384" & "ES512": Variations of "ES256" with SHA-384 and SHA-512 (and curves P-384 and P-521).
  • "PS256", "PS384" & "PS512": RSASSA-PSS + MGF1 with SHA-256/SHA-384/SHA-512

JWS Header Claims

See section 4.1. of RFC 7515

Serializations

JWS spec defines two types of serialization:

  • JWS Compact Serialization
    • The typical JWT serialization
    • Base64Url-encoded header, payload and signature separated with dots
    • Single signature
  • JWS JSON Serialization, with two alternatives
    • General syntax that supports multiple signatures
    • Flattened syntax (a single signature)

For more details, see section 7 of RFC 7515.

JSON Web Encryption (JWE)

While JWS makes it possible to validate data, JWE makes it possible to prevent third parties from reading the data.

As in JWS, two schemes:

  • a shared secret scheme - A party that holds the shared secret can encrypt and decrypt data
  • a public/private key scheme
    • A party that holds the public key can encrypt data.
    • A party that holds the private key can decrypt data.
    • NOTE: Anyone holding the public key can encrypt new data
      • Thus JWE does not replace role of JWS in token exchange
      • JWE and JWS are complementary when using a public/private key scheme.
    • Encrypted JWTs are sometimes nested: An encrypted JWT serves as a container for a signed JWT

Structure of an encrypted JWT

Encrypted JWT compact representation has 5 elements (instead of 3 in signed and unsecured JWTs)

  1. The protected header - As JWS header
  2. The encrypted key - A symmetric key used to encrypt the ciphertext & other encrypted data
    • Note that the ciphertext is encrypted in a symmetric way even if an asymmetric algorithm is used to encrypt the key.
  3. The initialization vector - Needed for some encryption algorithms
  4. The encrypted data (ciphertext)
  5. The authentication tag - Can be used to validate the ciphertext
    • Note that this doesn't remove the need for nested JWTs

Key Encryption Algorithms

Key Encryption Algorithms ("alg" header) recommended to be implemented:

  • RSA variants:
    • "RSA1_5" - RSAES-PKCS1-v1_5 (NOTE: marked for removal of the recommendation)
    • "RSA-OAEP" - RSAES-OAEP with defaults (marked to be required in the future)
  • AES variants
    • "A128KW" - AES-128 Key Wrap
    • "A256KW" - AES-256 Key Wrap
  • Elliptic Curve variants:
    • "ECDH-ES" - Elliptic Curve Diffie-Hellman Ephemeral Static (ECDH-ES) using Concat KDF (marked to be required in the future)
  • Combinations
    • "ECDH-ES+A128KW" - ECDH-ES using Concat KDF and CEK wrapped with AES-128
    • "ECDH-ES+A256KW" - ECDH-ES using Concat KDF and CEK wrapped with AES-256

Key Management Modes

JWE spec defines a couple of different Key Management Modes related to determining the Content Encryption Key (CEK)

  • Key Encryption - CEK is encrypted for the intended recipient using an asymmetric encryption algorithm
  • Key Wrapping - CEK is encrypted for the intended recipient using a symmetric encryption algorithm
  • Direct Key Agreement - a key agreement algorithm is used to agree upon the CEK value.
  • Key Agreement with Key Wrapping - a key agreement algorithm is used to agree upon a symmetric key used to encrypt the CEK value to the intended recipient using a symmetric key wrapping algorithm.
  • Direct Encryption - shared symmetric key is used as the CEK

It's important to note that CEK and JWE encryption key are different things

  • CEK is the key used to encrypt/decrypt the actual data payload
  • JWE encryption key is used to encrypt or compute the CEK (unless Direct Encryption is used)
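The split between the CEK and the key that protects it can be sketched with plain JDK crypto. This is not a JWE implementation: the class name CekSketch is made up, nothing is Base64Url-encoded or serialized, and I use AES-GCM for the content instead of the AES-CBC + HMAC algorithms listed as required, simply because the JDK offers it as a single authenticated cipher. The key is wrapped with RSA-OAEP (the "Key Encryption" mode).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class CekSketch {
    private static final String KEY_ALG = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";
    private static final String CONTENT_ALG = "AES/GCM/NoPadding";

    /** What the recipient receives (the header would travel alongside as authenticated data). */
    static final class Sealed {
        final byte[] encryptedKey;     // the CEK, wrapped with the recipient's public key
        final byte[] iv;               // initialization vector for the content cipher
        final byte[] ciphertextAndTag; // the JDK appends the GCM authentication tag to the ciphertext
        Sealed(byte[] encryptedKey, byte[] iv, byte[] ciphertextAndTag) {
            this.encryptedKey = encryptedKey;
            this.iv = iv;
            this.ciphertextAndTag = ciphertextAndTag;
        }
    }

    static KeyPair newRecipientKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static Sealed encrypt(String plaintext, byte[] header, PublicKey recipientKey) {
        try {
            // 1. A fresh random CEK encrypts the actual payload (symmetric).
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey cek = keyGen.generateKey();
            // 2. The CEK itself is encrypted for the recipient (asymmetric).
            Cipher wrapper = Cipher.getInstance(KEY_ALG);
            wrapper.init(Cipher.WRAP_MODE, recipientKey);
            byte[] encryptedKey = wrapper.wrap(cek);
            // 3. Content encryption; the header is authenticated but not encrypted.
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher content = Cipher.getInstance(CONTENT_ALG);
            content.init(Cipher.ENCRYPT_MODE, cek, new GCMParameterSpec(128, iv));
            content.updateAAD(header);
            return new Sealed(encryptedKey, iv, content.doFinal(plaintext.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static String decrypt(Sealed sealed, byte[] header, PrivateKey recipientKey) {
        try {
            Cipher unwrapper = Cipher.getInstance(KEY_ALG);
            unwrapper.init(Cipher.UNWRAP_MODE, recipientKey);
            Key cek = unwrapper.unwrap(sealed.encryptedKey, "AES", Cipher.SECRET_KEY);
            Cipher content = Cipher.getInstance(CONTENT_ALG);
            content.init(Cipher.DECRYPT_MODE, cek, new GCMParameterSpec(128, sealed.iv));
            content.updateAAD(header);
            return new String(content.doFinal(sealed.ciphertextAndTag), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            // Includes a failed authentication tag check (tampered header or ciphertext).
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair recipient = newRecipientKeyPair();
        byte[] header = "{\"alg\":\"RSA-OAEP\",\"enc\":\"A128GCM\"}".getBytes(StandardCharsets.US_ASCII);
        Sealed sealed = encrypt("{\"sub\":\"1234567890\"}", header, recipient.getPublic());
        System.out.println(decrypt(sealed, header, recipient.getPrivate()));
    }
}
```

Note how anyone holding the public key could have produced the Sealed value, which matches the point above that encryption alone does not tell the recipient who created the token.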

Required Content Encryption Algorithms ("enc" header):

  • AES CBC + HMAC SHA - AES 128/256 with Cipher Block Chaining and HMAC + SHA-256/512 for validation.
    • "A128CBC-HS256" - AES_128_CBC_HMAC_SHA_256
    • "A256CBC-HS512" - AES_256_CBC_HMAC_SHA_512

JWE Header Claims

See section 4.1. of RFC 7516

JSON Web Keys (JWK)

  • Different representations for the keys used for signatures and encryption
  • Aiming for a unified presentation of all keys supported in the JWA spec.

An example JWK from RFC 7517:

{
  "kty": "EC",    // Key type: Elliptic Curve
  "crv": "P-256", // Curve type: P-256
  "x": "f83OJ3D2xF1Bg8vub9tLe1gHMzV76e8Tus9uPHvRVEU", // base64-encoded x & y coordinates
  "y": "x_FEzRu9m36HLN_tue659LNpXW6pCyStikYjKIWI5a0", // (Parameters for elliptic curves)
  "kid": "Public key used in JWS spec Appendix A.3 example" // Key identifier
}

Common parameters (more details in section 4 of RFC 7517):

  • kty (Key Type) - "EC" / "RSA" / "oct" (symmetric keys)
  • use (Public Key Use) - "sig" (signature) / "enc" (encryption)
  • key_ops (Key Operations)
    • an array of strings specifying detailed uses for the key
    • Potential values "sign", "verify", "encrypt", "decrypt", "wrapKey", "unwrapKey", "deriveKey", "deriveBits"
  • alg (Algorithm) - the algorithm intended for use with the key
  • kid (Key ID) - A unique identifier for this key.
  • x5u (X.509 URL) - A URL pointing to an X.509 public key certificate or certificate chain in PEM encoded form
  • x5c (X.509 Certificate Chain) - An array of Base64-encoded (standard Base64, not Base64-URL) X.509 DER public key certificates forming the certificate chain
  • x5t (X.509 Certificate SHA-1 Thumbprint) - Base64-URL encoded SHA-1 thumbprint/fingerprint of the DER encoding of an X.509 certificate
  • x5t#S256 (X.509 Certificate SHA-256 Thumbprint) - As x5t, but with SHA-256 thumbprint/fingerprint.
  • Other parameters specific to the key algorithm. e.g. x, y, d, n, e etc.

JSON Web Key Sets (aka JWK Sets)

  • Carry more than one key
  • Meaning of the order of the keys is user-defined
  • A JSON object with "keys" field consisting of a JSON array of JWKs

JSON Web Algorithms

In this chapter, the algorithms used in earlier chapters are discussed in more detail.

Base64

Base64 is a binary-to-text encoding used widely with JWT, JWS and JWE. With JWT & related specs, a URL-safe variant of Base64 is used. For more details, see e.g. RFC 4648
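A tiny sketch of the difference between the two variants, using java.util.Base64 (the class name Base64UrlSketch is made up). The URL-safe alphabet replaces '+' and '/' with '-' and '_', and the JWT-related specs also drop the trailing '=' padding.

```java
import java.util.Base64;

final class Base64UrlSketch {

    static String standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** URL-safe alphabet without padding, as used in JWT, JWS and JWE. */
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    public static void main(String[] args) {
        // Two bytes chosen so that the output hits the characters that differ.
        byte[] data = {(byte) 0xfb, (byte) 0xff};
        System.out.println(standard(data)); // +/8=
        System.out.println(urlSafe(data));  // -_8
    }
}
```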

Secure Hash Algorithm (SHA)

  • SHA used in JWT is defined in FIPS-180, see also RFC 4634.
  • Note: Not to be confused with SHA-1 (deprecated, should not be used)
    • FIPS-180 SHA is sometimes called SHA-2
  • For JWT, SHA-256 & SHA-512 are of interest.
  • Roughly:
    • Input is processed in fixed-size chunks
    • For each chunk, perform a bunch of mathematical operations
    • Result is accumulated with previous chunk results
    • After all chunks, digest is computed.
  • For code example, see sha256.js
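On the JVM there is no need to implement SHA-256 by hand; java.security.MessageDigest provides it. The sketch below (class name Sha256Sketch is made up) feeds the input in pieces to show that the digest is the same as hashing everything at once. The pieces here are arbitrary caller-side slices, not the 64-byte blocks the algorithm uses internally; MessageDigest does that buffering itself.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha256Sketch {

    private static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Hashes the whole input in one call. */
    static byte[] oneShot(byte[] data) {
        return newDigest().digest(data);
    }

    /** Hashes the input piece by piece; the accumulated state yields the same digest. */
    static byte[] chunked(byte[] data, int chunkSize) {
        MessageDigest digest = newDigest();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int length = Math.min(chunkSize, data.length - offset);
            digest.update(data, offset, length);
        }
        return digest.digest();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(hex(oneShot(data)));
        System.out.println(hex(chunked(data, 2)));
    }
}
```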

Hash-based Message Authentication Code (HMAC)

  • Use a cryptographic hash function (e.g. SHA family) and a key to create an authentication code.
  • Takes a hash function, a message and a secret key as input
  • Produces an authentication code (HMAC) as output

Definition from RFC 2104:

To compute HMAC over the data `text' we perform

H(K XOR opad, H(K XOR ipad, text))

  • ipad = the byte 0x36 repeated B times
  • opad = the byte 0x5C repeated B times

So, e.g. "HS256" (HMAC + SHA256) means HMAC using SHA-256 as the hash function.
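The RFC 2104 formula above is short enough to write out by hand. The sketch below (class name HmacSketch is made up) does so for SHA-256, where the block size B is 64 bytes, and keeps the JDK's own HmacSHA256 next to it for comparison. It is only meant to make the formula concrete; real code should use javax.crypto.Mac directly.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class HmacSketch {

    private static final int B = 64; // block size of SHA-256 in bytes

    /** H(part1 || part2 || ...), with H = SHA-256. */
    private static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                digest.update(part);
            }
            return digest.digest();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** H(K XOR opad, H(K XOR ipad, text)) as defined in RFC 2104. */
    static byte[] hmacSha256(byte[] key, byte[] text) {
        // Keys longer than B are hashed first; the result is zero-padded to B bytes.
        byte[] k = Arrays.copyOf(key.length > B ? sha256(key) : key, B);
        byte[] keyXorIpad = new byte[B];
        byte[] keyXorOpad = new byte[B];
        for (int i = 0; i < B; i++) {
            keyXorIpad[i] = (byte) (k[i] ^ 0x36);
            keyXorOpad[i] = (byte) (k[i] ^ 0x5c);
        }
        byte[] inner = sha256(keyXorIpad, text);
        return sha256(keyXorOpad, inner);
    }

    /** The JDK implementation, used as a reference. */
    static byte[] reference(byte[] key, byte[] text) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(text);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "secret".getBytes();
        byte[] text = "header.payload".getBytes();
        System.out.println(Arrays.equals(hmacSha256(key, text), reference(key, text)));
    }
}
```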

RSA

  • "RSA" stands for the initials of its developers Ron Rivest, Adi Shamir and Leonard Adleman.
  • Has variations both for signing and encryption
  • Relies on integer factorization being a computationally expensive operation.

The RSA "basic expression": (m^e)^d = m (mod n) where

  • It is computationally feasible to find very large integers e, d and n that satisfy the equation.
  • It is relatively difficult to find d when other numbers are known.
  • Public key is composed of values n and e
  • Private key is composed of values n and d

More details can be found from e.g. The Public Key Cryptography Standard #1 (PKCS #1) (RFC 3447).

Signing with RSA

Signing:

  • Produce a message digest from the message
  • Raise digest to the power of d mod n
  • Attach the result as signature

Verifying signature:

  • Raise signature to the power of e mod n
  • Produce a message digest from the message
  • If the results from previous steps match, the signature is valid

JWT "RS256" signature algorithm is PKCS#1 RSASSA v1.5 using SHA-256.
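The signing and verification steps above can be sketched as "textbook" RSA with BigInteger. The class name ToyRsaSketch is made up, and the sketch is deliberately not RS256: it applies no PKCS#1 v1.5 padding and is not safe for real use. It only shows the roles of n, e and d.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

final class ToyRsaSketch {
    final BigInteger n; // modulus, part of both keys
    final BigInteger e; // public exponent
    final BigInteger d; // private exponent

    ToyRsaSketch(int primeBits) {
        SecureRandom random = new SecureRandom();
        BigInteger publicExponent = BigInteger.valueOf(65537);
        BigInteger p;
        BigInteger q;
        BigInteger phi;
        do {
            p = BigInteger.probablePrime(primeBits, random);
            q = BigInteger.probablePrime(primeBits, random);
            phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));
        } while (p.equals(q) || !phi.gcd(publicExponent).equals(BigInteger.ONE));
        this.n = p.multiply(q);
        this.e = publicExponent;
        // Easy with p and q at hand; hard to find when only n and e are known.
        this.d = publicExponent.modInverse(phi);
    }

    /** SHA-256 digest of the message as a non-negative integer, reduced mod n. */
    private BigInteger digest(String message) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(message.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, hash).mod(n);
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }

    /** Signing: raise the digest to the power of d mod n (private key). */
    BigInteger sign(String message) {
        return digest(message).modPow(d, n);
    }

    /** Verifying: raise the signature to the power of e mod n and compare with the digest. */
    boolean verify(String message, BigInteger signature) {
        return signature.modPow(e, n).equals(digest(message));
    }

    public static void main(String[] args) {
        ToyRsaSketch key = new ToyRsaSketch(512);
        BigInteger signature = key.sign("header.payload");
        System.out.println(key.verify("header.payload", signature));
        System.out.println(key.verify("header.tampered", signature));
    }
}
```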

Elliptic Curve (EC)

Elliptic curves are a different area of mathematics that provides a "one-way function" that is easy to compute but hard to invert (the elliptic curve discrete logarithm problem).

Some math resources:

Elliptic Curve Digital Signature Algorithm (ECDSA)

  • Curves and algorithms defined in FIPS 186-4 + other associated standards.
    • JWA uses three curves: P-256, P-384, and P-521.
  • Each curve defines a "base point" G used for EC operations:
    • Private key can be constructed by picking a random number between 1 and n (order of base point G)
    • Public key can be computed by multiplying the base point G by the private key
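The JDK hides the curve arithmetic behind KeyPairGenerator and Signature, so the key generation described above can be tried without any extra library. In the sketch below the class name Es256Sketch is made up; "secp256r1" is the JDK's name for the P-256 curve. One caveat: the JDK's "SHA256withECDSA" returns a DER-encoded signature, whereas JWS expects the raw R and S values concatenated, so this output is not directly a valid "ES256" JWS signature.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.ECGenParameterSpec;

final class Es256Sketch {

    /** Picks a random private key on P-256 and derives the matching public key. */
    static KeyPair newP256KeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(new ECGenParameterSpec("secp256r1"));
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** ECDSA over the SHA-256 digest of the data; result is DER-encoded. */
    static byte[] sign(byte[] data, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(privateKey);
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean verify(byte[] data, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(publicKey);
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newP256KeyPair();
        byte[] data = "header.payload".getBytes(StandardCharsets.US_ASCII);
        byte[] signature = sign(data, keys.getPrivate());
        System.out.println(verify(data, signature, keys.getPublic()));
    }
}
```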

"ES256" is ECDSA using elliptic curve P-256 and SHA-256 hash.

Best practices

Based on RFC 8725.

Common pitfalls / attacks

  • alg: none
    • setting header "alg" to "none" and modifying payload
  • Using RS256 Public-key as HS256 secret
    • as public key is often public
  • Weak HMAC keys
    • If using a HMAC key of "typical password length", brute force attack might be possible
  • Wrong stacked encryption + signature verification assumptions
    • Wrong assumption that encryption would provide also protection against tampering
    • Esp. non-standard encryption algorithms might not have data integrity verification
    • Nested JWTs: Failing to validate innermost JWT when encrypted JWT is carrying a signed JWT
  • Invalid Elliptic curve attacks
  • Substitution attacks
    • Sending a token intended for recipient A to recipient B (if both verify the token with the same public key)

Mitigations

Friday, May 1, 2020

Reactive Programming with RxJava

This time I read Reactive Programming with RxJava by Ben Christensen and Tomasz Nurkiewicz. Various notes follow (mainly new things / things I especially want to write down).

Some basics / How RxJava works (Ch. 1)

The contract of an RxJava Observable:

  • Events onNext(), onCompleted() and onError() can never be emitted concurrently.
  • E.g. onNext() can't be invoked in a thread if it is still being executed on another thread.
  • Why? Mainly because concurrency is difficult.
  • See also full Rx Observable contract.

The Observable type is lazy, meaning it does nothing until it is subscribed to.

  • Subscription, not construction starts work.
  • Observables can be reused.

Observable duality with Iterable:

| Pull (Iterable)  | Push (Observable)  |
|------------------|--------------------|
| T next()         | onNext(T)          |
| throws Exception | onError(Throwable) |
| returns          | onCompleted()      |
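The pull/push duality can be sketched with plain Java, without RxJava on the classpath. SimpleObserver below is a made-up stand-in for the three Observable callbacks, not the real rx.Observer type; the point is only that the same sequence of events comes out whether the consumer pulls or the producer pushes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class DualitySketch {

    /** Hypothetical stand-in for an Rx observer: one callback per row of the comparison. */
    interface SimpleObserver<T> {
        void onNext(T value);
        void onError(Throwable error);
        void onCompleted();
    }

    /** Pull: the consumer drives and asks for each value. */
    static <T> List<String> pull(Iterable<T> source) {
        List<String> events = new ArrayList<>();
        try {
            Iterator<T> iterator = source.iterator();
            while (iterator.hasNext()) {
                events.add("next " + iterator.next()); // T next()
            }
            events.add("completed");                   // loop ends, method returns
        } catch (RuntimeException e) {
            events.add("error");                       // throws Exception
        }
        return events;
    }

    /** Push: the producer drives and calls the observer back. */
    static <T> void push(Iterable<T> source, SimpleObserver<T> observer) {
        try {
            for (T value : source) {
                observer.onNext(value);
            }
        } catch (RuntimeException e) {
            observer.onError(e);
            return;
        }
        observer.onCompleted();
    }

    /** Collects the pushed events into the same textual form as pull(). */
    static <T> List<String> pushToList(Iterable<T> source) {
        List<String> events = new ArrayList<>();
        push(source, new SimpleObserver<T>() {
            @Override public void onNext(T value) { events.add("next " + value); }
            @Override public void onError(Throwable error) { events.add("error"); }
            @Override public void onCompleted() { events.add("completed"); }
        });
        return events;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(5, 6);
        System.out.println(pull(source));       // [next 5, next 6, completed]
        System.out.println(pushToList(source)); // [next 5, next 6, completed]
    }
}
```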

Cardinalities:

|              | One                 | Many                    |
|--------------|---------------------|-------------------------|
| Synchronous  | T getData()         | Iterable<T> getData()   |
| Asynchronous | Future<T> getData() | Observable<T> getData() |

Concurrency vs Parallelism

  • Parallelism is when multiple tasks are simultaneously executed e.g. on different CPUs or machines.
  • Concurrency is composition and/or interleaving of multiple tasks, that can be done even with a machine with a single core.

Reactive Extensions (Ch. 2)

Observable is a push-based stream of events.

Subscribing to notifications from Observable. Remember the contract: Even if events are emitted from many threads, callbacks will not be invoked from more than one thread at a time.

Observable<Foo> foos = ...

foos.subscribe(
    (Foo foo) -> { System.out.println(foo); },
    (Throwable t) -> { t.printStackTrace(); },
    () -> System.out.println("No more")
);

Example for Observable.create:

Observable<Integer> ints = Observable.
    create(new Observable.OnSubscribe<Integer>() {
        @Override
        public void call(Subscriber<? super Integer> subscriber) {
            log("Create");
            subscriber.onNext(5);
            subscriber.onNext(6);
            subscriber.onCompleted();
            log("Completed");
        }
    });
log("Start");
ints.subscribe(i -> log("Element: " + i));
log("End");

produces

main: Start
main: Create
main: Element: 5
main: Element: 6
main: Completed
main: End

By default, emitting does not begin until we actually subscribe. Every time subscribe() is called, handler is invoked again.

In some cases (e.g. DB query or heavy computation) this might not be wanted -> cache() operator. NOTE: cache() and infinite stream is a recipe for disaster.

Operators wrap existing Observables, enhancing them typically by intercepting subscription.

Hot and cold observables

  • Cold Observable is completely lazy, begins to emit events only when subscribed to.
    • Without observers, nothing happens.
    • Each subscriber receives its own copy of the stream.
  • Hot Observables are independent of consumers.
    • Hot Observable might emit events even without Subscribers.
    • Typically when we have no control over the source of events, e.g. UI actions.
    • Note that you can't be sure whether you have received all events from the beginning.

Subject & ConnectableObservable

Subject both extends Observable and implements Observer at the same time. Some examples (JavaDocs: RxJava 1 rx.subjects, RxJava 2 io.reactivex.subjects)

  • PublishSubject
  • AsyncSubject
  • BehaviorSubject
  • ReplaySubject

See also ConnectableObservable (RxJava 1, RxJava 2)

Operators and Transformations (Ch. 3)

An operator is a function that takes upstream Observable<T> and returns downstream Observable<R> (T and R might or might not be the same).

Operators are often visualized as marble diagrams, see e.g. the Observable javadocs (e.g. Observable.map() - RxJava 1 / RxJava 2). See also the interactive marble diagrams at https://rxmarbles.com/.

There are many operators for various purposes (not going to list them here).

Custom operators

  • In case of a transformation from an Observable to another Observable, there is an interface Observable.Transformer that can be given to Observable.compose. (This would usually be a series of operators)
  • In case existing operators are not enough, there is Observable.lift.
    • When compose() transforms Observables, lift() transforms Subscribers.
    • Note: Backpressure & subscription mechanism need to be taken into account.
    • Note: Built-in operators are usually implemented as operators for lift(), e.g. observable.filter(predicate) is implemented as observable.lift(new OperatorFilter<>(predicate))

Applying Reactive Programming to Existing Applications (Ch. 4)

Notes on Multithreading in RxJava

Even though not enforced or suggested by the type system, many Observables are asynchronous from the very beginning, and you should assume that.

  • A blocking subscribe() method happens very rarely, when a lambda within Observable.create() is not backed by any asynchronous process or stream.
  • However, by default (with create()) everything happens in the client thread.

Scheduler

  • RxJava is concurrency-agnostic - does not introduce concurrency on its own.
  • Certain operators cannot work properly without concurrency.
  • RxJava has Scheduler (similar to ScheduledExecutorService)
  • Used together with subscribeOn() and observeOn() as well as when creating certain types of Observables.
  • Naively, a Scheduler can be thought of as a thread pool and a Worker as a thread inside that pool.

Built-in schedulers:

  • Schedulers.newThread()
    • a new thread each time
    • "hardly ever a good choice"
  • Schedulers.io()
    • similar to newThread() but recycles threads and can possibly handle future requests
    • for I/O bound tasks, waiting for network/disk
  • Schedulers.computation()
    • for entirely CPU-bound tasks
    • by default number of threads executed in parallel is limited by value of availableProcessors()
  • Schedulers.from(Executor executor)
  • Schedulers.immediate()
    • Invokes a task within the client thread in a blocking fashion
  • Schedulers.trampoline()
    • Similar to immediate() but executes a task when all previously scheduled tasks complete.
  • Schedulers.test()
    • For testing purposes
    • Ability to arbitrarily advance the clock

subscribeOn & observeOn

  • What happens if there are two subscribeOn() invocations between Observable and subscribe()?
    • subscribeOn() closest to the original Observable wins.
  • Note: In entirely reactive software stacks, subscribeOn() is almost never used, yet all Observables are asynchronous.
    • In general, subscribeOn() is used quite rarely, mostly when retrofitting existing APIs or libraries.
  • In general, flatMap() and merge() are the operators for parallelism

Simple analogies:

  • Observable without any Scheduler works like a single-threaded program with blocking method calls passing data between one another.
  • Observable with a single subscribeOn() is like starting a big task in the background Thread. The program within that Thread is still sequential, but at least it runs in the background.
  • Observable using flatMap() where each internal Observable has subscribeOn() works like java.util.concurrent.ForkJoinPool, where each substream is a fork of execution and flatMap() is a safe join (merge) step.

observeOn():

  • Everything below observeOn() is run within the supplied Scheduler.
  • subscribeOn() & observeOn() work well together when you want to physically decouple producer (Observable.create()) and consumer (Subscriber).

Note that many operators use some Scheduler by default, typically Schedulers.computation().

Reactive from Top to Bottom (Ch. 5)

Non-blocking applications tend to provide great performance and throughput for a fraction of the hardware. By limiting the number of threads, we're able to fully utilize CPU without consuming gigabytes of memory.

C10k problem:

The advice for interacting with relational databases is to actually have a dedicated, well-tuned thread pool and isolate the blocking code there.
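A minimal sketch of that advice using only java.util.concurrent. Everything named here (BlockingIsolationSketch, blockingQuery, the "db-pool" thread name, the pool size of 4) is made up for illustration; blockingQuery stands in for a JDBC call. In a real application the pool would be created once and shared, not created per call as in this short example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BlockingIsolationSketch {

    /** Hypothetical blocking call; reports which thread it ran on. */
    static String blockingQuery() {
        return "row fetched on " + Thread.currentThread().getName();
    }

    /** Runs the blocking call on a dedicated, bounded pool instead of the caller's thread. */
    static String runIsolated() {
        ExecutorService dbPool = Executors.newFixedThreadPool(4, runnable -> {
            Thread thread = new Thread(runnable, "db-pool");
            thread.setDaemon(true);
            return thread;
        });
        try {
            Callable<String> task = BlockingIsolationSketch::blockingQuery;
            Future<String> result = dbPool.submit(task);
            return result.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            dbPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runIsolated()); // row fetched on db-pool
    }
}
```

With RxJava the same pool could presumably be handed to Schedulers.from(Executor), which is listed among the built-in schedulers above.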

Short introduction to CompletableFuture. Semantically, you can treat CompletableFuture like an Observable with the following characteristics:

  • It is hot.
  • It is cached.
  • It emits exactly one element or exception.
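These three characteristics can be observed with a small experiment (the class name FutureAsObservableSketch is made up). The supplier starts running before anything consumes the result, it runs only once, and two independent consumers both get the same cached value.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class FutureAsObservableSketch {

    /** Returns {1 if the work started without a consumer, first result, second result, computation count}. */
    static int[] run() {
        AtomicInteger computations = new AtomicInteger();
        CountDownLatch started = new CountDownLatch(1);

        // Hot: the supplier is scheduled right away, before any callback is attached.
        CompletableFuture<Integer> future = CompletableFuture.supplyAsync(() -> {
            computations.incrementAndGet();
            started.countDown();
            return 42;
        });

        boolean startedWithoutConsumer;
        try {
            startedWithoutConsumer = started.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Cached: both consumers observe the single computed value; nothing is recomputed.
        int first = future.thenApply(x -> x + 1).join();
        int second = future.thenApply(x -> x * 2).join();
        return new int[] {startedWithoutConsumer ? 1 : 0, first, second, computations.get()};
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println(result[0] + " " + result[1] + " " + result[2] + " " + result[3]); // 1 43 84 1
    }
}
```

Compare this with a cold Observable, where each subscription would trigger the work again unless cache() is applied.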

Flow Control and Backpressure (Ch. 6)

RxJava has two ways of dealing with producers being more active than subscribers:

  • Various flow-control mechanisms such as sampling and batching (built-in operators)
  • Subscribers can propagate their demand and request only as many items as they can process - by using a feedback channel known as backpressure.

Flow control operators

  • sample() & throttleFirst()
  • buffer() (different overloaded versions) - extremely flexible, quite complex.
  • window() (different overloaded versions)
  • debounce()

Backpressure

  • In essence, backpressure is a feedback channel from the consumer to producer.
  • a protocol that allows the consumer to request how much data it can consume at a time.
  • "consumer" here: terminal subscribers as well as intermediate operators
  • Switches from push to pull-push model
  • When creating Observables, think about correctly handling the backpressure requests.
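The request-based protocol can be tried out with the JDK's java.util.concurrent.Flow interfaces (Java 9+), which mirror the Reactive Streams API. This is not RxJava code: RangePublisher and BatchingSubscriber are made-up names, the publisher is synchronous and single-threaded, and it skips parts of the Reactive Streams rules (for example it does not reject non-positive request amounts). It only shows the producer emitting no more than what the consumer has asked for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class BackpressureSketch {

    /** Hypothetical cold publisher of 0..count-1 that emits only what has been requested. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        private final int count;
        RangePublisher(int count) { this.count = count; }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 0;
                private long demand = 0;
                private boolean emitting = false;
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) return;
                    demand += n;
                    if (emitting) return; // re-entrant call from onNext(): the running loop picks it up
                    emitting = true;
                    while (demand > 0 && next < count && !done) {
                        demand--;
                        subscriber.onNext(next++);
                    }
                    emitting = false;
                    if (next == count && !done) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Subscriber that asks for batchSize items at a time and logs what happens. */
    static final class BatchingSubscriber implements Flow.Subscriber<Integer> {
        final List<String> log = new ArrayList<>();
        private final int batchSize;
        private Flow.Subscription subscription;
        private int receivedInBatch = 0;

        BatchingSubscriber(int batchSize) { this.batchSize = batchSize; }

        private void requestBatch() {
            log.add("request " + batchSize);
            subscription.request(batchSize);
        }

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; requestBatch(); }

        @Override public void onNext(Integer item) {
            log.add("item " + item);
            if (++receivedInBatch == batchSize) {
                receivedInBatch = 0;
                requestBatch(); // signal that we are ready for more
            }
        }

        @Override public void onError(Throwable t) { log.add("error"); }
        @Override public void onComplete() { log.add("complete"); }
    }

    static List<String> run(int count, int batchSize) {
        BatchingSubscriber subscriber = new BatchingSubscriber(batchSize);
        new RangePublisher(count).subscribe(subscriber);
        return subscriber.log;
    }

    public static void main(String[] args) {
        // [request 2, item 0, item 1, request 2, item 2, complete]
        System.out.println(run(3, 2));
    }
}
```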

Testing and troubleshooting (Ch. 7)

According to the Reactive Manifesto, reactive systems should be responsive, resilient, elastic and message-driven. Here responsiveness and resiliency are discussed. Observable is a container for values or errors.

Case Studies (Ch. 8)

A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable

  • Leslie Lamport, 1987

Managing Failures with Hystrix

UPDATE NOTE: Nowadays Hystrix is in maintenance mode, the most prominent successor being resilience4j.

RxJava has many operators that support writing scalable, reactive and resilient applications

  • Declarative concurrency with Schedulers.
  • Timeouts and various error handling mechanisms.
  • Parallelizing work with flatMap and limiting concurrency at the same time.

Hystrix helps to work with actions that can potentially fail, applying clever logic around such code. Mechanisms include:

  • Bulkhead patterns by cutting off misbehaving actions entirely for a certain time.
    • Bulkheads are large walls across a ship's hull that create watertight compartments.
  • Failing fast by applying timeouts, limiting concurrency, and implementing a so-called circuit breaker:
    • In electrical systems, a circuit breaker's responsibility is to interrupt the flow of electricity in order to protect devices from overload and even from catching fire.
  • Batching requests by collapsing small orders into one big order.
    • Note that batching makes no sense under low load.
  • Collecting, publishing, and visualizing performance statistics.
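
As a rough illustration of the circuit-breaker idea, here is a minimal plain-Java sketch. It is not Hystrix's implementation or API: the class `CircuitBreakerSketch`, its parameters and the injected clock are all assumptions made for this example, and the code is not thread-safe. After a number of consecutive failures the breaker "opens" and answers with the fallback without calling the action. Once the open period has passed, it lets one trial call through.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class CircuitBreakerSketch {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;   // injected so the demo is deterministic
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    CircuitBreakerSketch(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (openedAt >= 0 && clock.getAsLong() - openedAt < openMillis) {
            return fallback.get();      // open: fail fast, do not touch the action
        }
        // Closed, or the open period has elapsed and this is the trial call
        try {
            T result = action.get();
            consecutiveFailures = 0;
            openedAt = -1;              // success closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong();   // (re)open the circuit
            }
            return fallback.get();
        }
    }

    static String demo() {
        long[] now = {0};
        int[] calls = {0};
        boolean[] healthy = {false};
        CircuitBreakerSketch breaker = new CircuitBreakerSketch(2, 1000, () -> now[0]);
        Supplier<String> action = () -> {
            calls[0]++;
            if (!healthy[0]) throw new IllegalStateException("backend down");
            return "ok";
        };
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            out.append(breaker.call(action, () -> "fallback")).append(' ');
        }
        // Only the first two attempts reached the action; the rest failed fast
        out.append("calls=").append(calls[0]).append(' ');
        healthy[0] = true;
        now[0] = 1500;                  // the open period (1000 ms) has passed
        out.append(breaker.call(action, () -> "fallback"));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Failing fast while the circuit is open is what gives the struggling dependency room to recover.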

Hystrix helps with self-healing on two fronts:

  • By turning off broken commands temporarily, it allows downstream dependencies to recover.
  • After recovering, the system returns to normal operation.

Other ways of expressing asynchronous computation

Note that there are also other ways of expressing asynchronous computation, for example:

  • Java 8 CompletableFuture & CompletionStage
  • Parallel Stream (see note below)
  • Flux and Mono from Project Reactor
  • Guava ListenableFuture

Note on parallel Java 8 Streams: all parallel streams share a single hardcoded ForkJoinPool whose size is aligned with the number of CPUs. This generally works fine for CPU-intensive tasks but does not work well for I/O-intensive tasks.
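
One common way around the shared pool for I/O-bound work is to run the blocking calls on a dedicated executor via CompletableFuture. The sketch below assumes a hypothetical blocking call `fetch` (simulated with a short sleep) and a hypothetical helper `fetchAll`. The pool size is an arbitrary choice for the example, not a recommendation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

public class IoPoolSketch {

    /** Hypothetical stand-in for a blocking I/O call. */
    static String fetch(int id) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response-" + id;
    }

    /** Runs fetch for every id on a dedicated pool instead of the common ForkJoinPool. */
    static List<String> fetchAll(List<Integer> ids, int poolSize) {
        ExecutorService ioPool = Executors.newFixedThreadPool(poolSize);
        try {
            // Start all calls first so they can overlap...
            List<CompletableFuture<String>> futures = ids.stream()
                    .map(id -> CompletableFuture.supplyAsync(() -> fetch(id), ioPool))
                    .collect(Collectors.toList());
            // ...then wait for the results in the original order
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        // The pool that parallel streams share; its size depends on the machine
        System.out.println("common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println(fetchAll(Arrays.asList(1, 2, 3, 4), 4));
    }
}
```

Because the blocking calls occupy threads from their own pool, they do not starve CPU-bound parallel streams running elsewhere in the same JVM.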

Be aware of operators consuming uncontrolled amounts of memory.

  • distinct() caching all seen events
  • Buffering events with toList() and buffer()
  • Caching with cache() and ReplaySubject

Note: Backpressure helps to keep memory usage low by failing fast.

Future Directions (Ch. 9)

  • Reactive Streams API
  • RxJava 2 (discussed at the end of the book)
    • E.g.: Separate io.reactivex.Observable (non-backpressured) and io.reactivex.Flowable (backpressure-enabled)
  • RxJava 3 (released after book)

Further updates after the book:

Saturday, February 1, 2020

Classification of typefaces based on historical periods

Here's another blog post / collection of notes based on The Elements of Typographic Style by Robert Bringhurst. The book is definitely recommended reading if you're interested in typography. Bringhurst has an interesting approach of viewing typefaces based on historical eras.

But letterforms are not only objects of science, they also belong to the realm of art and they participate in history. They have changed over time just as music, painting or architecture have changed and the same historical terms—Renaissance, Baroque, Neoclassical, Romantic and so on—are useful in each of these fields.

I've gathered some quotes as well as links to example typefaces. Unfortunately, the typography of this blog does not match the book :/

Early roman inscriptions

The examination starts from Greek & Roman inscriptions. Example modern typefaces based on these:

  • Lithos (1989, based on Greek letterforms)
  • Trajan (1989, based on letterforms of Roman square capitals).

Renaissance Roman and Italic Letter

It is interesting that the names "roman" and "italic" are used for "upright" vs "cursive" letterforms. The "roman" in roman letter refers to its background in the letterforms of the Roman Empire. "Italic" type evolved in Italy during the Renaissance.

Bringhurst's words on the history of "Roman" and "Italic":

Roman type consists of two quite different basic parts. The upper case, which does indeed come from Rome, is based on Roman imperial inscriptions. The lower case was developed in northern Europe, chiefly in France and Germany, in the late Middle Ages, and given its final polish in Venice in the early Renaissance.

Italic letterforms, on the other hand, are an Italian Renaissance creation. Some early italics come from Rome, others from elsewhere in Italy ...

For more details, see Development of roman type and Italic type from Wikipedia.

Renaissance Roman Letter

Bringhurst's description for Renaissance roman letter:

Like Roman inscriptional capitals, Renaissance roman lowercase letters have a modulated stroke (the width varies with direction) and a humanist axis. This means that the letters have the form produced by a broadnib pen held in the right hand in a comfortable and relaxed writing position. The thick strokes run NW/SE, the axis of the writer's hand and forearm. The serifs are crisp, the stroke is light, and the contrast between thick strokes and thin strokes is generally modest.

Examples based on Renaissance Roman letter forms:

Renaissance Italic Letter

Some characteristics of the Renaissance italic letter by Bringhurst:

  • stems vertical or of fairly even slope, not exceeding 10°
  • bowls generally elliptical
  • light, modulated stroke
  • consistent humanist axis
  • ...

Examples based on Renaissance Italic letter forms:

Baroque

Bringhurst:

Baroque typography, like Baroque painting and music, is rich with activity and takes delight in the restless and dramatic play of contradictory forms. One of the most obvious features of any Baroque typeface is the large variation in axis from one letter to the next. Baroque italics are ambidextrous: both right- and lefthanded. And it was during the Baroque that typographers first made a habit of mixing roman and italic on the same line.

Examples:

Rococo

The historical periods listed here ... are naturally not limited, in typography, to roman and italic letters. Blackletter and script types passed through the same phases as well. The Rococo period, with its love of florid ornament, belongs almost entirely to blackletters and scripts.

The Neoclassical Letter

Generally speaking, Neoclassical art is more static and restrained than either Renaissance or Baroque art, and far more interested in rigorous consistency. Neoclassical letterforms follow this pattern. In Neoclassical letters, an echo of the broadnib pen can still be seen, but it is rotated away from the natural writing angle to a strictly vertical or rationalist axis. The letters are moderate in contrast and aperture, but their axis is dictated by an idea, not by the truth of human anatomy. They are products of the Rationalist era: frequently beautiful, calm forms, but forms oblivious to the more complex beauty of organic fact. If Baroque letterforms are ambidextrous, Neoclassical letters are, in their quiet way, neitherhanded.

Examples:

The Romantic Letter

Neoclassicism and Romanticism are not sequential movements in European history. They marched through the eighteenth century and much of the nineteenth side by side: vigorously opposed in some respects and closely united in others. Both Neoclassical and Romantic letterforms adhere to a rationalist axis, and both look more drawn than written, but it is possible to make some precise distinctions between the two. The most obvious difference is one in contrast.

In Romantic letters we will normally find the following:

  • abrupt modulation of the stroke
  • vertical axis intensified through exaggerated contrast
  • hardening of terminals from lachrymal to round
  • serifs thinner and more abrupt
  • aperture reduced

Examples:

  • Bulmer (around 1790)
  • Didot (a group of typefaces, developed in 1784–1811)

The Realist Letter

Realist type designers - Alexander Phemister, Robert Besley and others, who have not achieved the posthumous fame of the painters - worked in a similar spirit. They made blunt and simple letters, based on the script of people denied the opportunity to learn to read and write with fluency and poise. Realist letters very often have the same basic shape as Neoclassical and Romantic letters, but most of them have heavy, slab serifs or no serifs at all. The stroke is uniform in weight, and the aperture (often a gauge of grace or good fortune in typefaces) is tiny. Small caps, text figures and other signs of sophistication and elegance are almost always missing.

Examples:

Geometric modernism

Early modernism took many intriguing typographic forms. One of the most obvious is geometric. The sparest, most rigorous architecture of the early twentieth century has its counterpart in the equally geometric typefaces designed at the same time, often by the same people. These typefaces, like their Realist predecessors, make no distinction between main stroke and serif. Their serifs are equal in weight with the main strokes or are missing altogether. But most Geometric Modernist faces seek purity more than populism. Some show the study of archaic inscriptions, and some include text figures and other subtleties, but their shapes owe more to pure mathematical forms - the circle and the line - than to scribal letters.

Examples:

Lyrical Modernism

Another major phase of modernism in type design is closely allied with abstract expressionist painting. Painters in the twentieth century rediscovered the physical and sensory pleasures of painting as an act, and the pleasures of making organic instead of mechanical forms. Designers of type during those years were equally busy rediscovering the pleasures of writing letterforms rather than drawing them. In rediscovering calligraphy, they rediscovered the broadnib pen, the humanist axis and humanist scale of Renaissance letters. Typographic modernism is fundamentally the reassertion of Renaissance form. There is no hard line between modernist design and Renaissance revival.

Examples:

The Expressionist Letter

In yet another of its aspects, typographic modernism is rough and concrete more than lyrical and abstract. ... typographic counterparts of expressionist painters ...

Examples:

Elegiac Postmodernism

... In the last decades of the twentieth century, critics of architecture, literature and music - along with others who study human affairs - all perceived movements away from modernism. Lacking any proper name of their own, these movements have come to be called by the single term postmodernism. ... Postmodern letterforms, like Postmodern buildings, frequently recycle and revise Neoclassical, Romantic and other premodern forms. At their best, they do so with an engaging lightness of touch and a fine sense of humor.

Examples:

Geometric Postmodernism

Some Postmodern faces are highly geometric. Like their predecessors the Geometric Modernist faces, they are usually slab-serifed or unserifed, but often they exist in both varieties at once or are hybrids of the two.

Examples:

Other classifications

For the interested, links to some other resources on typeface classification:

Wednesday, January 29, 2020

Quotes from The Elements of Typographic Style

During the Christmas holidays, I had time to read an interesting classic, The Elements of Typographic Style by Robert Bringhurst. The book had been on my to-read list since the Hack Design typography lessons. The author's love of typography is strongly present in the book, as well as his background as a poet and an author. I collected some fascinating quotes from the book.

First, some thoughts on the purpose of typography:

... typography should perform these services for the reader:

  • invite the reader into the text
  • reveal the tenor and meaning of the text
  • clarify the structure and order of the text
  • link the text with other existing elements
  • induce a state of energetic repose, which is the ideal condition for reading.

On a well made book

In a badly designed book, the letters mill and stand like starving horses in a field. In a book designed by rote, they sit like stale bread and mutton on the page. In a well-made book, where designer, compositor and printer have all done their jobs, no matter how many thousands of lines and pages, the letters are alive. They dance in their seats. Sometimes they rise and dance in the margins and aisles.

Quality or quantity?

With type as with philosophy, music and food, it is better to have a little of the best than to be swamped with the derivative, the careless, the routine.

Good typography is like bread

Logotypes and logograms push typography in the direction of hieroglyphics, which tend to be looked at rather than read. They also push it toward the realm of candy and drugs, which tend to provoke dependent responses, and away from the realm of food, which tends to promote autonomous being. Good typography is like bread: ready to be admired, appraised, and dissected before it is consumed.

Thread metaphor

An ancient metaphor: thought is a thread, and the raconteur is a spinner of yarns — but the true storyteller, the poet, is a weaver. The scribes made this old and audible abstraction into a new and visible fact. After long practice, their work took on such an even, flexible texture that they called the written page a textus, which means cloth.

Motivation for text figures

It is true that text figures are rarely useful in classified ads, but they are useful for setting almost everything else, including good magazine copy and newspaper copy. They are basic parts of typographic speech, and they are a sign of civilization: a sign that dollars are not really twice as important as ideas, and numbers are not afraid to consort on an equal footing with words.

Shaping the page

A book is a flexible mirror of the mind and the body. Its overall size and proportions, the color and texture of the paper, the sound it makes as the pages turn, and the smell of the paper, adhesive and ink, all blend with the size and form and placement of the type to reveal a little about the world in which it was made. If the book appears to be only a paper machine, produced at their own convenience by other machines, only machines will want to read it.

On craftsmanship and instinct

Instinct, in matters like these, is largely memory in disguise. It works quite well when it is trained, and poorly otherwise. But in a craft like typography, no matter how perfect one's instincts are, it is useful to be able to calculate answers exactly. History, natural science, geometry and mathematics are all relevant to typography in this regard - and can all be counted on for aid.

By the way, this reminds me of the discussion on intuition in the book Thinking, Fast and Slow.

Bringhurst doesn't value reading literature from a screen

The screen mimics the sky, not the earth. It bombards the eye with light instead of waiting to repay the gift of vision. It is not simultaneously restful and lively, like a field full of flowers, or the face of a thinking human being, or a well-made typographic page. And we read the screen the way we read the sky: in quick sweeps, guessing at the weather from the changing shapes of clouds, or like astronomers, in magnified small bits, examining details. We look to it for clues and revelations more than wisdom. This makes it an attractive place for advertising and dogmatizing, but not so good a place for thoughtful text.

Whether typography is more engineering or art

The rate of change in typesetting methods has been steep — perhaps it has approximated the Fibonacci series — for more than a century. Yet, like poetry and painting, storytelling and weaving, typography itself has not improved. There is no greater proof that typography is more art than engineering. Like all the arts, it is basically immune to progress, though it is not immune to change.

Friday, January 17, 2020

Notes & quotes from "Transforming Nokia"

I finished an interesting book by Risto Siilasmaa: Transforming NOKIA: The Power of Paranoid Optimism to Lead Through Colossal Change. (Technically, I listened to the Finnish version, "Paranoidi optimisti - Näin johdin Nokiaa murroksessa"). The main story of the book follows the history of Nokia since 2008 when Siilasmaa joined Nokia's board of directors. In addition to Nokia's story during that time, Siilasmaa discusses his thoughts on leadership. I'm gonna take some picks & quotes related to these leadership ideas.

Paranoid optimism & scenario mapping

Paranoid optimism is a way of thinking through (in a "paranoid way") the different things & scenarios that could go wrong and finding out how to either prevent them or prepare for them. This preparation allows for optimism that balances the paranoia. The book's website has an article on the topic: How Paranoid Optimism Helped Nokia Survive Apple and Android.

Golden rules for boards

Siilasmaa shares the Golden rules for boards used at Nokia. The main ideas are mostly relevant in other team-working contexts as well. The list here is shortened from a related tweet.

  • Assume the best of intentions in the actions of others.
  • Data-driven philosophy based on analysis.
  • Be well educated in the company's success and deeply engaged in the discussions with the management
  • Be prepared for a passionate debate, but do it in an informed and respectful way.
  • Firm and respectful challenge of the management
  • We seek to constantly improve in everything we do.
  • A board meeting where we do not laugh out loud is a miserable failure.

Some quotes

"No news is bad news. Bad news is good news. And good news is no news." (Source tweet)

"Trust is the glue that binds us together and the oil that removes friction and enables motion." (Source tweet)