Ignoring performance for a moment, I'm not sure your initial observation holds. There is no reason to think that a processor could not maintain a register-private key used for decryption. If we used crypto techniques, how would that change your rationale? I'm not suggesting this is the way to go, just that it is possible within the design space.
The presence or absence of decryption does not impact the issue of atomicity. The second processor can still observe that the bytes are being transferred.
I think perhaps you took this as a security issue rather than a semantics issue. It is not a security issue. When running in the kernel, the second processor can do anything the kernel lets it (obviously). When running in user land, the second processor can still observe only what its capabilities authorize it to see.
The semantic question is as follows:
If I hold the authority to read your address space by prior grant, and we run on a uniprocessor, the current design will not permit me to observe an IPC to your address space while it is in progress, because the IPC copy is non-preemptable and non-interruptible.
If I am on another processor, however, I *can* observe the copy.
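To make the distinction concrete, here is a minimal Python sketch. It is a simulation, not kernel code: the names `ipc_copy`, `observe_uniprocessor`, and `observe_multiprocessor` are hypothetical, and each `yield` merely stands in for a point at which a second processor could read the destination buffer.

```python
def ipc_copy(src, dst):
    """Copy src into dst one byte at a time.

    Each yield marks a point where a reader on another processor
    could look at dst (an assumption of this simulation).
    """
    for i, b in enumerate(src):
        dst[i] = b
        yield


def observe_uniprocessor(src, dst):
    # Non-preemptable copy: the observer can run only before the
    # copy starts and after it finishes.
    seen = [bytes(dst)]
    for _ in ipc_copy(src, dst):
        pass  # nothing else is scheduled during the copy
    seen.append(bytes(dst))
    return seen


def observe_multiprocessor(src, dst):
    # The observer runs concurrently, so it may read dst after any
    # individual byte has been written.
    seen = [bytes(dst)]
    for _ in ipc_copy(src, dst):
        seen.append(bytes(dst))
    return seen


uni = observe_uniprocessor(b"abcd", bytearray(4))
multi = observe_multiprocessor(b"abcd", bytearray(4))
```

In this model `uni` holds only the empty and the fully copied buffer, while `multi` also holds the partially copied states in between.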
Once we accept that in some implementations the copy is observable, it is worth asking whether we really need the IPC to be non-preemptable. The point of my note is that the answer is "no." The IPC can be semantically well specified without imposing this requirement. Relaxing the requirement should significantly speed up the IPC implementation.
Is that clearer? Did I understand you correctly?