This includes some corrections to previous posts I've made.
The connection between physical entropy and information is:
bits = S / ln(2).
This is the minimal number of yes/no questions (the number of bits in a message) that have to be asked to determine what state a physical system is in.
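As a minimal sketch of that counting (a toy system with W equally likely microstates, working in natural units where k = 1, so the entropy is just ln W):

```python
import math

W = 1024                 # equally likely microstates of a toy system
S = math.log(W)          # entropy in natural-log units (nats), taking k = 1
bits = S / math.log(2)   # minimal number of yes/no questions to pin down the state

print(round(bits, 6))    # log2(1024) = 10 questions
```

Halving the candidate set with each yes/no answer takes log2(W) questions, which is exactly S / ln(2).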
Landauer's principle says that at least kT ln(2) of energy must be dissipated as heat when a bit is erased. This is a restatement of the equation above: at least k ln(2) of entropy is generated when 1 bit is erased. The ln(2) is the conversion from log base 2 units (bits) to natural log units (nats). Boltzmann's constant k is the conversion from temperature units (a measure of the average kinetic energy per particle) to joules of energy, so entropy is fundamentally unitless. The apparent units of k are why many are skeptical that physical entropy is the same as information entropy.
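Plugging in numbers makes the limit concrete. A quick sketch of the Landauer bound at room temperature:

```python
import math

k_B = 1.380649e-23  # Boltzmann's constant, J/K
T = 300.0           # room temperature, K

# Minimum heat dissipated when one bit is erased (Landauer's principle):
E_per_bit = k_B * T * math.log(2)

print(f"{E_per_bit:.3e} J per bit erased")  # ~2.87e-21 J
```

This is roughly twenty orders of magnitude below what present-day hardware actually dissipates per bit, which is why the limit is of mostly theoretical interest.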
Storing a bit uses energy and holds it as potential energy; erasing the bit converts it to kinetic energy that is released as heat. Erasing a bit is the same as "computing": a NAND gate, which suffices to build a Turing machine, takes two inputs and has one output, erasing one of the inputs with each operation. If reversible computing is done (assuming it has no theoretical problems such as dissipation), then the bit can go back and forth between potential energy and kinetic energy like a pendulum.
To make entropy equal to an energy, it has to be at a defined temperature, via dQ = T dS. In some sense, temperature is a "noise in the channel" that requires a corresponding increase in energy to transmit or store a given amount of entropy.
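The "noise in the channel" point can be sketched numerically: via dQ = T dS, erasing the same number of bits (the same dS) costs more energy the hotter the system is.

```python
import math

k_B = 1.380649e-23  # Boltzmann's constant, J/K

def heat_released(bits, T):
    """Heat from dQ = T dS, with dS = bits * k ln(2)."""
    return T * bits * k_B * math.log(2)

# Same entropy change (erasing one gigabit), but the hotter the "channel",
# the more energy it takes:
for T in (4.2, 300.0, 1000.0):
    print(f"T = {T:7.1f} K -> Q = {heat_released(1e9, T):.3e} J")
```

At liquid-helium temperature the cost per bit is about 70 times lower than at room temperature, which is one motivation for cryogenic computing.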
There are some complications in connecting information to entropy at the microscopic level. Shannon's informational entropy H is a specific entropy (akin to molar entropy) with units of bits per symbol of a message source. So the informational entropy of a message (or a store of bits) is S = N x H, where N is the number of symbols. Likewise, for a bulk of material with physical specific entropy S' (S per kg or S per mole), if there are N kg of it then its physical entropy is S = N x S', and the number of bits (yes/no questions) needed to specify its state is again S / ln(2). The S' is only valid at a given temperature and pressure.
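A small sketch of S = N x H for a message, estimating H from the symbol frequencies of an example string:

```python
import math
from collections import Counter

def shannon_H(message):
    """Shannon specific entropy H in bits per symbol,
    estimated from the symbol frequencies of the message."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

msg = "AABABBBACA"          # toy 3-symbol message
H = shannon_H(msg)          # bits per symbol (~1.36 here)
S_total = len(msg) * H      # total informational entropy S = N x H

print(f"H = {H:.3f} bits/symbol, S = {S_total:.2f} bits")
```

Note H here is per symbol of a 3-letter alphabet, so it lands between 1 and log2(3) ≈ 1.585 bits, exactly the "specific entropy" role S' plays for matter.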
On the microscopic scale, where you derive S' for a physical system at a given temperature and pressure, the connection to Shannon's specific informational entropy H is not easy to see. It is very difficult to relate N symbols in a message to N particles in a gas (or phonons in a solid). The problem is that a physical system can use anywhere from 0 to N gas particles to hold a specified amount of heat energy, and the number n of quantum states each particle might be in depends on the box size and on how many of the N particles are carrying the energy. But when you calculate informational H, you use a fixed alphabet of n possible symbols in a message of fixed length N. Physical entropy of a system has more options for a given N, so it comes out higher. To see the connection you would have to calculate the entropy H of a source that may use 0 to n symbols in messages of length 0 to N, with a specific relation imposed between the two ranges. This is as complicated as deriving the entropy of a gas in a box, as it should be.
After the derivation, for a given type of material at a T and P you get:
S (entropy of a gas of N particles) ~ N x (log(possible states per particle) + constant)
Because of the constant inside the parentheses, you still can't make N particles of gas directly equivalent to N symbols in our normal view of bits in a message or file, which is bits = N x log2(possible symbols). But it does enable the connection I mentioned above, i.e.
bits in physical system = N S' / ln(2)
bits in message or file = N H
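To put a number on the physical-system side, here is a sketch using the Sackur-Tetrode equation for a monatomic ideal gas (my choice of example, not derived in this post): S/(N k) = ln(V/(N λ³)) + 5/2, where λ is the thermal de Broglie wavelength. Dividing by ln(2) gives the bits needed to specify the microstate.

```python
import math

k_B = 1.380649e-23   # Boltzmann's constant, J/K
h = 6.62607015e-34   # Planck's constant, J s
N_A = 6.02214076e23  # Avogadro's number, particles per mole

def sackur_tetrode_bits(N, V, T, m):
    """Yes/no questions (bits) to specify the microstate: S / (k ln 2),
    with S from the Sackur-Tetrode equation for a monatomic ideal gas."""
    lam = h / math.sqrt(2 * math.pi * m * k_B * T)   # thermal de Broglie wavelength
    S_over_k = N * (math.log(V / (N * lam**3)) + 2.5)
    return S_over_k / math.log(2)

# One mole of helium (atomic mass ~6.64e-27 kg) at 300 K in 0.0224 m^3:
bits = sackur_tetrode_bits(N_A, 0.0224, 300.0, 6.64e-27)
print(f"~{bits:.2e} bits")  # on the order of 1e25 bits
```

The corresponding entropy k x (S/k) comes out near 125 J/(mol K), close to helium's tabulated standard molar entropy, and note the "constant" (the 5/2 and the λ³ term) is exactly what blocks a naive N-particles-as-N-symbols reading.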
Again, the simple yes/no counting I gave above shows the connection is exact: physical entropy and informational entropy are deeply the same thing, differing only by a constant, i.e. bits = S / ln(2).