Size review in C/C++
	float:       32-bit (single-precision)
	double:      64-bit (double-precision)
	long double: 80-bit extended on x86 (padded to 12 or 16 bytes in memory), 128-bit on some architectures
	__float128:  128-bit (quad-precision)
		"Q" suffix if defining a constant
		printf has no conversion for these.  quadmath_snprintf (libquadmath, link with -lquadmath) will do it
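
A quick way to check these sizes on whatever machine you are on. This is only a sketch: the function names are made up for illustration, and it leaves out __float128 so that it builds without GCC extensions or libquadmath. Note that sizeof reports storage including padding, so an 80-bit long double will typically show up as 12 or 16 bytes.

```c
#include <float.h>
#include <stddef.h>

/* Storage size in bytes of each type on the current target
   (hypothetical helper names, not from any library). */
size_t float_bytes(void)       { return sizeof(float); }
size_t double_bytes(void)      { return sizeof(double); }
size_t long_double_bytes(void) { return sizeof(long double); }

/* Significand precision in bits, assuming FLT_RADIX == 2.
   An x87 80-bit long double reports 64 here; a true quad reports 113. */
int float_precision_bits(void)       { return FLT_MANT_DIG; }
int double_precision_bits(void)      { return DBL_MANT_DIG; }
int long_double_precision_bits(void) { return LDBL_MANT_DIG; }
```

Printing the results with %zu and %d from a small main shows whether long double is really wider than double on a given compiler and target.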
	
Reminder about printf:
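	A float passed to printf is promoted to double, so %f always prints a double
	cvtss2sd:  Convert Scalar Single to Scalar Double
		There are cvt instructions for a lot of conversions
	Make sure to put a 1 in rax (al = number of vector registers holding arguments)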
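
The promotion rule is visible from the C side too. Below is a minimal sketch of a variadic function (sum_varargs is a made-up name) that, like printf, can only ever pull doubles out of its argument list. A float passed in gets converted by the caller first, which is the same job cvtss2sd does when the caller is written in assembly.

```c
#include <stdarg.h>

/* Sums `count` floating-point variadic arguments.
   va_arg asks for double, never float: any float argument was already
   promoted to double by the caller before the call happened. */
double sum_varargs(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);

    return total;
}
```

Calling sum_varargs(2, some_float, 2.25) works even though the first value started life as a float.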

Using an immediate float value:
	More of a headache than you'd think
		Unless there's an easier way I don't know
	.float or .double, put it in memory somewhere
	movsd (or movss for a float) to load it where it needs to go
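
The same pattern can be tried from C with GCC-style inline asm. This is a sketch under assumptions: it assumes GCC or Clang on x86-64 with the default AT&T syntax, the names k_constant and load_constant are invented for the example, and on any other target it falls back to plain C. The constant lives in memory and movsd brings it into an xmm register.

```c
/* The constant has to live in memory; there is no immediate form. */
static const double k_constant = 2.5;

/* Returns k_constant, loading it with movsd where that is possible. */
double load_constant(void)
{
    double out;
#if defined(__x86_64__) && defined(__GNUC__)
    /* AT&T operand order: source first, destination second.
       "m" forces a memory operand, "=x" asks for an SSE register. */
    __asm__("movsd %1, %0" : "=x"(out) : "m"(k_constant));
#else
    out = k_constant;   /* fallback for other targets */
#endif
    return out;
}
```

In a standalone assembly file the equivalent would be a .double in a data section plus a movsd from that label.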

Calling a C library function from assembly
	Like sin or cos or whatever

How about using floating point and interacting with C?
	Let's finish covering everything for project 4
	Then you can do it whenever

Floating-point return value:  xmm0
	Should be easy enough

Alright, scenarios:
	1.  Write a function in C, call it from assembly
	2.  Write a function in assembly, call it from C
	3.  Write a function inside an asm directive
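
Scenario 3 is the easiest to try without a separate assembly file. A rough sketch, assuming GCC or Clang on x86-64 with AT&T syntax (asm_add is a made-up name, and other targets get a plain C fallback): the function body is a single addsd, and the compiler takes care of getting the arguments into xmm registers and the result back out.

```c
/* Adds two doubles with a single addsd instruction where possible. */
double asm_add(double a, double b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* addsd src, dst computes dst = dst + src.
       "+x" marks a as read and written in an SSE register,
       "x" puts b in another SSE register. */
    __asm__("addsd %1, %0" : "+x"(a) : "x"(b));
#else
    a += b;             /* fallback for other targets */
#endif
    return a;
}
```

From the caller's point of view this is an ordinary C function, so it can be called from C or from assembly like any other.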

Could we do a vector operation?
	Add float arrays
	vmovaps:  Vector mov Aligned Packed Single
		Other variants:  no v prefix (legacy SSE), unaligned, scalar, double, etc
	vaddps:  Vector add Packed Single
		Need to specify operands AND destination
	How big of an array in one step?
		Depends how new and expensive our CPU is
		xmm:  4 floats (128-bit, SSE)
		ymm:  8 floats (256-bit, AVX)
		zmm:  16 floats (512-bit, AVX-512)
		We can go in batches otherwise
	long URL:  https://www.tomshardware.com/pc-components/cpus/amds-zen-5-avx-512-performance-tested-zen-5-performs-significantly-better-than-zen-4-on-linux-without-consuming-any-more-power
	Could also copy this off to a GPU...
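
Adding float arrays in batches of 4 can be sketched in C with SSE intrinsics instead of hand-written assembly. Assumptions: add_float_arrays is a made-up name, the code uses unaligned loads and stores (movups-style) rather than the aligned form so the caller does not have to align anything, and on a compiler without SSE it falls back to the scalar loop alone.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* out[i] = a[i] + b[i] for i in [0, n). */
void add_float_arrays(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

#if defined(__SSE__)
    /* One xmm register holds 4 floats, so take 4 elements per step. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned packed load */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif

    /* Leftover elements (or everything, without SSE) go one at a time. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The same shape works with ymm or zmm registers by switching to the 256-bit or 512-bit intrinsics and stepping by 8 or 16, provided the CPU supports them.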

Speaking of the GPU:
	I think we should talk about these a little at some point
	Compiler comes from the GPU manufacturer
		We can't really use assembly
		
Back to x86:
	Would the optimizer use AVX?
	-mavx2 flag for gcc lets it emit AVX2; the vectorizer also needs optimization on (-O3)
		Let's find out!
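
One way to find out is to give the compiler a loop that is easy to vectorize and read the assembly it produces. A sketch, with scale_and_add as a made-up test function: compiling it with something like gcc -O3 -mavx2 -S and searching the output for ymm registers should show whether the optimizer went wide. The exact result depends on the GCC version, so treat this as an experiment rather than a guarantee.

```c
#include <stddef.h>

/* out[i] = a[i] * k + b[i].
   restrict promises the arrays do not overlap, which removes one
   common reason for the vectorizer to give up or add runtime checks. */
void scale_and_add(float *restrict out,
                   const float *restrict a,
                   const float *restrict b,
                   float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * k + b[i];
}
```

Comparing the -S output with and without -mavx2 (and at -O0 versus -O3) makes the difference easy to spot.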