You can’t change the laws of physics: a long focal length (telephoto) lens has to produce a very small image, or be mounted very far from the sensor plane. Most phone lenses are wide-angle (roughly 13–26 mm in 35 mm SLR equivalent), because that’s the only way to work within the small available height. Similarly, a large sensor gives better dynamic range, but would require the lens to be mounted further away from it, so most phone makers use tiny sensors that produce grainy, nasty images, and then hammer them with software processing until they look good. There is no such thing as a “raw file” from a modern phone: you never see the sensor output.
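That “35 mm equivalent” figure is just the actual focal length scaled by the crop factor, i.e. the ratio of the full-frame sensor diagonal to the phone sensor diagonal. As a minimal sketch (the sensor and lens figures below are illustrative, roughly matching a typical phone main camera, not specs for any particular model):

```python
import math

FULL_FRAME_DIAGONAL_MM = math.hypot(36.0, 24.0)  # ~43.27 mm for a 36x24 mm frame

def equivalent_focal_length(actual_focal_mm: float, sensor_diag_mm: float) -> float:
    """35 mm-equivalent focal length = actual focal length x crop factor."""
    crop_factor = FULL_FRAME_DIAGONAL_MM / sensor_diag_mm
    return actual_focal_mm * crop_factor

# Assumed example values: a ~1/2.55" sensor (diagonal ~7.06 mm)
# behind a ~4.25 mm lens, both typical of a recent phone main camera.
eq = equivalent_focal_length(4.25, 7.06)
print(f"{eq:.0f} mm equivalent")  # ~26 mm: wide-angle, despite the tiny physical lens
```

The point the crop factor makes concrete: to get a 100 mm-equivalent telephoto on that sensor you’d need a ~16 mm physical focal length, which simply doesn’t fit in a phone’s thickness without folded optics.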
Optically speaking, iPhones have never had a good camera. Apple learned a long time ago that people don’t care about accuracy, as long as the picture looks “good” (this is a fundamental design principle of Apple products, incidentally). These days, a very sophisticated AI processor on the phone turns the distorted, grubby, washed-out image from the sensor into a sharply resolved picture with deeply saturated colours. Much like your own eyes, in fact...