Answering a question at Stack Exchange.
At a given temperature, stored bits have a precise theoretical minimum potential energy. Erasing bits (which occurs with every change of state in a CPU gate that has 2 inputs but only 1 output) releases heat energy. For a given mass, kinetic energy is not fundamentally different from potential energy due to relativity. Heat, however, is the kinetic energy of a myriad of interacting particles moving in different directions, so only part of that kinetic energy can be converted to potential energy. In that sense the disorganized kinetic energy of heat is fundamentally different from potential energy.
Entropy is related to heat by
dS = dQ / T
where dQ is heat added reversibly at temperature T.
Entropy is related to potential energy by S = (U + pV - G)/T where U = internal energy, p = pressure, V = volume, and G is the Gibbs free energy. U and p can increase if heat is added. G is the amount of energy in a system that can reversibly do work. That is the kind of potential energy we are interested in, and its relation to entropy, if the other variables in a given system are kept constant, is:
dS = -dG / T
The connection between physical entropy and information is:
bits = S / ln(2) / k.
where S is physical entropy and k is Boltzmann's constant. This is the number of yes/no questions (the number of bits in a message) that have to be asked to determine what state a physical system is in.
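As a rough numerical check of this conversion, here is a minimal Python sketch. The function names are my own, chosen for illustration, and k is the exact SI value of Boltzmann's constant.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact in the 2019 SI)

def entropy_to_bits(s_joule_per_kelvin):
    """Convert physical entropy S (J/K) to bits: bits = S / ln(2) / k."""
    return s_joule_per_kelvin / (K_B * math.log(2))

def bits_to_entropy(bits):
    """Inverse conversion: S = bits * k * ln(2), in J/K."""
    return bits * K_B * math.log(2)

# One J/K of physical entropy expressed as a number of yes/no questions.
print(f"{entropy_to_bits(1.0):.3e} bits per J/K")
```

One J/K comes out to roughly 10^23 bits, which is why the entropy of everyday objects dwarfs any data storage we build.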
This comes from Landauer's principle, which says at least kT ln(2) of energy must be lost as heat when a bit is erased. Equivalently, at least k ln(2) of entropy is generated when 1 bit is erased. The ln(2) is the conversion from log base 2 units (bits) to natural log units (nats). Boltzmann's constant k is the conversion from temperature energy units (average RMS energy per particle) to Joules of energy, so k is fundamentally unitless. The apparent units of k are why many are skeptical that physical entropy is fundamentally the same as information entropy, differing only by the conversion constant k ln(2).
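To put a number on the Landauer limit, here is a small sketch that evaluates kT ln(2). The 300 K temperature and the function name are assumptions picked for illustration.

```python
import math

K_B = 1.380649e-23             # Boltzmann's constant, J/K
JOULES_PER_EV = 1.602176634e-19  # one electronvolt in Joules

def landauer_limit_joules(temperature_kelvin, bits=1.0):
    """Minimum heat (J) released when erasing `bits` bits at temperature T."""
    return bits * K_B * temperature_kelvin * math.log(2)

per_bit = landauer_limit_joules(300.0)  # assumed room temperature
print(f"per bit at 300 K: {per_bit:.3e} J = {per_bit / JOULES_PER_EV:.4f} eV")
print(f"per gigabyte at 300 K: {landauer_limit_joules(300.0, bits=8e9):.3e} J")
```

The limit scales linearly with both T and the number of bits erased, so halving the temperature halves the minimum cost per bit.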
Information uses potential energy: storing a bit stores potential energy. That energy is converted to kinetic energy and released as heat when you erase the bit. Erasing a bit is the same as "computing" (a NAND gate, which can be used to make a Turing machine, takes two inputs and has one output, erasing 1 of the inputs with each subsequent computation). If reversible computing is done (assuming it has no theoretical problems such as dissipation), then the bit can go back and forth between potential energy and kinetic energy like a pendulum.
To make entropy "equal" to an energy, there has to be a change in the entropy that corresponds to a change in energy at some defined temperature via dQ = T dS. Temperature seems to be something like noise in a communication channel, where you have to use more energy to "shout" the message louder in order to overcome the noise.
There are some complications in connecting information to entropy at the microscopic level. Shannon's informational entropy H is a specific entropy (analogous to molar entropy) that has units of bits per symbol of a message source. So the informational entropy of a message (or of stored data) is S = N x H, where N is the number of symbols. For a bulk of mass with physical specific entropy S' (S per kg or S per mole), if there are N kg (or N moles) of it, then its physical entropy is S = N x S'. The S' is specified for a given temperature and pressure. Holding T and P constant is what allows this simple S = N x S' to parallel the simple case of the entropy of a file, where S = N x H.
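The S = N x H relation for a message can be sketched in a few lines of Python. This is only an illustration: it estimates H from the symbol frequencies of the message itself, which assumes the symbols are independent, and the function names are mine.

```python
import math
from collections import Counter

def shannon_entropy_per_symbol(message):
    """Estimate H in bits per symbol from the symbol frequencies of `message`."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def message_entropy_bits(message):
    """Total informational entropy S = N x H, with N the number of symbols."""
    return len(message) * shannon_entropy_per_symbol(message)

for msg in ["aaaa", "aabb", "abcd"]:
    h = shannon_entropy_per_symbol(msg)
    print(msg, "H =", h, "bits/symbol, S =", message_entropy_bits(msg), "bits")
```

A message of one repeated symbol carries no information, while four equally likely symbols give 2 bits per symbol, the same way a larger number of accessible states per particle raises the specific entropy S'.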
On the microscopic scale, where you derive S' for a physical system at a given temperature and pressure, it is not easy to see the connection to Shannon's specific informational entropy H. It is very difficult to see the relation between N symbols in a message and N particles in a gas (or phonons in a solid). The problem is that physical entropy is constrained by N, temperature, pressure, and the "size of the box" in a way we never constrain information. The physical entropy of a system has more options for a given N, so it comes out higher. To see the connection you would have to calculate the entropy H of a message source that can use anywhere from 0 to n symbols in messages of length 0 to N, and force a specific relation between n and N (the number n of possible quantum states in N particles depends on U, T, V, and P).
For a given type of material at a given T and P, entropy is typically of the form:
S entropy of N particles ~ N x ( log(possible states per N) + constant)
which is still not exactly like a file, where information S = N x log(2).
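One concrete instance of this N x (log(...) + constant) form is the Sackur-Tetrode equation for a monatomic ideal gas. The choice of an ideal gas, of helium, and of 298.15 K and 1 bar are my assumptions for illustration, not something the argument above depends on. The sketch computes the entropy per particle in nats, then converts it to bits and to J/(mol K).

```python
import math

K_B = 1.380649e-23          # Boltzmann's constant, J/K
H_PLANCK = 6.62607015e-34   # Planck's constant, J s
N_A = 6.02214076e23         # Avogadro's number, 1/mol
U_ATOMIC = 1.66053906660e-27  # atomic mass unit, kg

def sackur_tetrode_nats_per_particle(mass_kg, temperature_k, pressure_pa):
    """Entropy per particle, in units of k, for a monatomic ideal gas.

    S / (N k) = ln(volume per particle / thermal wavelength^3) + 5/2
    """
    thermal_wavelength = H_PLANCK / math.sqrt(
        2.0 * math.pi * mass_kg * K_B * temperature_k
    )
    volume_per_particle = K_B * temperature_k / pressure_pa  # ideal gas law
    return math.log(volume_per_particle / thermal_wavelength**3) + 2.5

# Helium (assumed example gas) at 298.15 K and 1 bar.
nats = sackur_tetrode_nats_per_particle(4.002602 * U_ATOMIC, 298.15, 1.0e5)
bits_per_atom = nats / math.log(2)
molar_entropy = nats * K_B * N_A  # J/(mol K)

print(f"{nats:.2f} nats/atom, {bits_per_atom:.2f} bits/atom, "
      f"{molar_entropy:.1f} J/(mol K)")
```

The log term counts the accessible states per particle and the 5/2 is the constant, so each helium atom carries on the order of 20 bits, far more than the 1 bit per symbol of a binary file.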