Native Multimodality
Training a single model jointly on multiple modalities (such as text and images) from the start of training, rather than training a text-only model first and bolting on vision later. The approach is closely associated with Google's Gemini, which was presented as multimodal from the ground up.
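One way to picture joint training is that text and image content end up in a single token sequence that one model consumes from the first training step. The sketch below is only an illustration under stated assumptions: the vocabulary sizes, the toy word tokenizer, and the helper names (`encode_text`, `encode_image`, `build_sequence`) are all hypothetical, and representing images as discrete codes is just one possible design, not a description of how Gemini or any specific model works.

```python
from typing import List, Sequence, Tuple, Union

TEXT_VOCAB_SIZE = 1000      # hypothetical size of the text vocabulary
IMAGE_CODEBOOK_SIZE = 256   # hypothetical number of discrete image codes

Segment = Tuple[str, Union[str, Sequence[int]]]


def encode_text(text: str) -> List[int]:
    """Toy tokenizer (assumption): map each word to an id in [0, TEXT_VOCAB_SIZE)."""
    return [sum(ord(ch) for ch in word) % TEXT_VOCAB_SIZE for word in text.split()]


def encode_image(codes: Sequence[int]) -> List[int]:
    """Shift image codes past the text range so both modalities share one id space."""
    for code in codes:
        if not 0 <= code < IMAGE_CODEBOOK_SIZE:
            raise ValueError(f"image code out of range: {code}")
    return [TEXT_VOCAB_SIZE + code for code in codes]


def build_sequence(segments: Sequence[Segment]) -> List[int]:
    """Interleave text and image segments into a single training sequence."""
    sequence: List[int] = []
    for kind, payload in segments:
        if kind == "text":
            sequence.extend(encode_text(payload))
        elif kind == "image":
            sequence.extend(encode_image(payload))
        else:
            raise ValueError(f"unknown modality: {kind}")
    return sequence


# A mixed example: text, then image codes, then more text, in one sequence.
example = build_sequence([
    ("text", "a cat"),
    ("image", [3, 17, 255]),
    ("text", "sits"),
])
print(example)
```

In a text-first pipeline, by contrast, only the `encode_text` path would exist during the initial training phase, and image inputs would be attached afterwards through a separately trained component.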