Entropy in a Few Minutes
Quickly connecting various notions of entropy
I’ve recently been collecting my thoughts on ‘entropy’ and wanted to explain it concisely. It is one of those non-trivial concepts that is introduced multiple times over a typical engineering education, yet when you ask for a definition you get several seemingly disjoint responses: disorder, irreversibility, or sometimes just a convex function. In this article I’ll do my best to explain, using an axiomatic approach, what entropy actually means, and to connect its various notions.
From the Basics
The first notion comes from thermodynamics, and it helps to define it quickly using an axiomatic formulation [1]. In my opinion, this is how it should be taught in colleges.
First, one can define a “system” S as a set, “states” as variables associated with it, and a “process” 𝙥 for each system, with an initial state ⌊𝙥⌋ and a final state ⌈𝙥⌉. Processes can also be “concatenated” (𝙥₁∘𝙥₂).
Then, one can define a “work” function W(𝙥) that assigns a number to each process. This function is defined to be additive when processes are concatenated, i.e. W(𝙥₁∘𝙥₂) = W(𝙥₁) + W(𝙥₂), which agrees with intuition. This leads naturally to the familiar assumption: the first law.
First Law: W depends only on the initial state ⌊𝙥⌋ and the final state ⌈𝙥⌉!
The first law states that the work does not depend on the process, only on its endpoint states. Based on this, one can define the “internal energy” U of a state as the work needed to reach it from a reference state (whose energy is U₀). Note that this is well defined precisely because, by the first law, the work is independent of the process taken.
This also leads to a definition of “heat” added to a system S as
𝗤(𝙥) = ΔU(𝙥) − W(𝙥), with ΔU = U(⌈𝙥⌉) − U(⌊𝙥⌋)
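To make the bookkeeping concrete, here is a minimal Python sketch of these definitions. The names (`Process`, `concatenate`, `heat`, the toy energy table `U`) are mine, not part of the formalism in [1]; the point is only to show work additivity under concatenation and the heat definition 𝗤(𝙥) = ΔU(𝙥) − W(𝙥).

```python
from dataclasses import dataclass

@dataclass
class Process:
    initial: str   # ⌊p⌋, the initial state
    final: str     # ⌈p⌉, the final state
    work: float    # W(p), work done on the system

def concatenate(p1: Process, p2: Process) -> Process:
    """p1 ∘ p2: run p1, then p2; work is additive by definition."""
    assert p1.final == p2.initial, "processes must be composable"
    return Process(p1.initial, p2.final, p1.work + p2.work)

# First law: U is well defined as the work needed to reach a state from a
# reference state (here simply given as a lookup table for this toy example).
U = {"ref": 0.0, "A": 5.0, "B": 12.0}

def heat(p: Process) -> float:
    """Q(p) = ΔU(p) − W(p)."""
    return (U[p.final] - U[p.initial]) - p.work

p1 = Process("A", "B", work=4.0)
print(heat(p1))                        # ΔU = 7, W = 4, so Q = 3 units of heat added

p_ref_to_A = Process("ref", "A", work=5.0)
p_ref_to_B = concatenate(p_ref_to_A, p1)
print(p_ref_to_B.work)                 # 9.0 = 5.0 + 4.0, additivity of W
```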
Before postulating the second law, we need one more entity: the “reservoir”, a system whose state is prescribed by U alone and from which no work can be extracted, i.e. Wᵣ(𝙥) ≥ 0. One more small ingredient is the “cyclic” process, which is simply one whose initial and final states are the same. This leads to the second law.
Second Law: If a system s, coupled to a reservoir r, undergoes a cyclic process, then W(𝙥) ≥ 0
This states, intuitively, that one cannot extract net work from a system coupled to a single reservoir over a cycle. Using the above tools, one can work through various scenarios that lead to interesting results, one of which is Carnot’s theorem.
The proof of Carnot’s theorem is by contradiction. First, one shows that heat has to flow from one reservoir to the other: if it did not, one could “complete the loop” on one reservoir and simply extract work from the other, even though the combined (engine + that reservoir) system undergoes a cyclic process, contradicting the second law.
Once that is done, one can define “efficiency” as the ratio of the magnitudes of the heats flowing between the two reservoirs, 𝗤ᵣ₁/𝗤ᵣ₂. A similar proof by contradiction then shows that no engine can be more efficient than a reversible one: running the two together would violate the second law.
Thus, the reversible efficiency 𝜂(𝑟₁,𝑟₂) between two reservoirs is determined solely by their states. One can now pick a reference reservoir, along the same lines as for internal energy, and define temperature as
T(r) = T₀ 𝜂(𝑟,𝑟₀).
Assume the cold reservoir is at the reference temperature T₀. If one uses a reversible engine, we have Q/T = Q₀/T₀. The energy Q₀ that cannot be extracted as work even by a reversible Carnot engine is thus proportional to Q/T, and this ratio indicates the amount of “disorder” inherent to the system — entropy.
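A small worked illustration of that relation, with made-up numbers:

```python
T0 = 300.0   # reference (cold) reservoir temperature, arbitrary units
T  = 600.0   # hot reservoir temperature
Q  = 1000.0  # heat drawn from the hot reservoir

# Reversible engine: Q/T = Q0/T0, so the heat that must be rejected is
Q0 = T0 * Q / T       # 500.0
W_max = Q - Q0        # 500.0, the most work any engine can extract here

# The unextractable part Q0 is proportional to Q/T, the entropy given up
# by the hot reservoir, which is what makes Q/T a useful measure of "disorder".
print(Q / T, Q0, W_max)
```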
Using the above postulates, it can also be shown that entropy increases from the initial to the final state for any irreversible process.
The reader is directed to Kammerlander’s thesis [2] for more detailed explanations. One figure from it that caught my attention shows how this formulation differs slightly from the classical axiomatic approach of Carathéodory [3].
The above figure also shows how thermodynamics can have various formulations that eventually lead to similar concepts.
Connection to other entropies
The other common interpretation of entropy arises from statistical mechanics, where entropy is defined in terms of microstates.
Starting from some assumptions on how particles occupy different energy levels, basic counting principles and approximations yield the probability distribution over the states of an ideal gas. This can then be used to evaluate macroscopic quantities such as the mean speed of an ideal-gas molecule. Together, these provide the value of the Boltzmann constant and thus the connection to the macroscopic notion of entropy as well.
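For instance, a standard textbook result (not derived here) is that the Maxwell-Boltzmann speed distribution gives a mean molecular speed of ⟨v⟩ = √(8 k_B T / π m); a quick numerical check for nitrogen at room temperature:

```python
import math

k_B = 1.380649e-23          # Boltzmann constant, J/K
m_N2 = 4.65e-26             # mass of an N2 molecule, kg (approximate)
T = 300.0                   # temperature, K

# Mean speed from the Maxwell-Boltzmann speed distribution
v_mean = math.sqrt(8 * k_B * T / (math.pi * m_N2))
print(f"{v_mean:.0f} m/s")  # roughly 475 m/s
```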
Shannon’s definition of entropy (𝔼[− log p(x)]) arises in a different domain, but it connects directly to Boltzmann entropy when all microstates are equiprobable: with p(x) = 1/W, where W is the number of microstates, it reduces to log W.
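A two-line check of that statement (using natural logarithms, so the result is in nats):

```python
import math

def shannon_entropy(p):
    """H = E[-log p(x)] = -sum p(x) log p(x), ignoring zero-probability states."""
    return -sum(px * math.log(px) for px in p if px > 0)

W = 8                                          # number of microstates
uniform = [1.0 / W] * W                        # all microstates equiprobable
print(shannon_entropy(uniform), math.log(W))   # both equal log 8 ≈ 2.079
```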
Numerical entropy is a notion used in simulations. It shares the name because it is also expressible in terms of state variables and mimics some properties of thermodynamic entropy, such as increasing monotonically across irreversible features like shocks. It also connects to disorder in settings such as Total Variation Diminishing (TVD) schemes, where more total variation corresponds to more “wiggles”, i.e. more “disorder”, in the solution.
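To see the “wiggles” interpretation, note that the total variation of a discrete solution is just the sum of the jumps between neighbouring cells; a TVD scheme guarantees this quantity does not grow from one time step to the next. Here is a sketch of the measurement itself (not of a TVD scheme):

```python
def total_variation(u):
    """TV(u) = sum_i |u[i+1] - u[i]| for a 1D discrete solution."""
    return sum(abs(b - a) for a, b in zip(u, u[1:]))

smooth = [0.0, 0.25, 0.5, 0.75, 1.0]   # monotone profile
wiggly = [0.0, 0.6, 0.2, 0.9, 1.0]     # oscillatory ("disordered") profile
print(total_variation(smooth), total_variation(wiggly))   # 1.0 vs about 1.8
```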
Thus, the common notions of entropy — macroscopic, statistical, information-theoretic and numerical — are related to each other in these ways.
References
[1] Renner, R. (2021). An Axiomatic Framework for Thermodynamics, https://www.youtube.com/watch?v=dWtiRYxovpw
[2] Kammerlander, P. Tangible Phenomenological Thermodynamics, PhD Thesis.
[3] Carathéodory, C. (1909). Untersuchungen über die Grundlagen der Thermodynamik. Mathematische Annalen, 67(3), 355–386.