Okay, so admittedly there were a lot of terms flying around and some confusing wording, but I will do my best to answer. As far as I can tell you are correct in most of your understanding, but there are some points to go over.
It is important to understand how paging and virtual memory work from a hardware perspective. Paging would be impractical without hardware support, because processes must be agnostic as to how memory is laid out, and the operating system should not have to babysit every process on the system in software. That's where the Memory Management Unit (MMU) comes in. The MMU is programmed by the operating system to arrange pages in a virtual address space, and it can be reconfigured by the operating system at will. The operating system tells the MMU which pages are actually in physical RAM and which pages are not loaded yet or have been swapped out.
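To make the translation concrete, here is a minimal sketch in C of the kind of lookup a 32-bit (non-PAE) x86 MMU performs. The bit layout is the real x86 one, but the function and names are mine, real hardware does this in silicon with TLB caching, and the sketch assumes the page tables themselves are directly addressable:

    #include <stdint.h>
    #include <stdbool.h>

    #define PTE_PRESENT 0x1u   /* bit 0: page is in physical RAM */

    /* Hypothetical two-level walk, as done by a 32-bit x86 MMU.
     * Returns true and fills *phys on success; false means a page fault. */
    static bool translate(const uint32_t *page_dir, uint32_t vaddr, uint32_t *phys)
    {
        uint32_t dir_idx = (vaddr >> 22) & 0x3FF;   /* top 10 bits    */
        uint32_t tbl_idx = (vaddr >> 12) & 0x3FF;   /* middle 10 bits */
        uint32_t offset  =  vaddr        & 0xFFF;   /* low 12 bits    */

        uint32_t pde = page_dir[dir_idx];
        if (!(pde & PTE_PRESENT))
            return false;   /* directory entry missing -> fault, OS takes over */

        /* Assumes the page table is reachable at its physical address. */
        const uint32_t *page_tbl = (const uint32_t *)(uintptr_t)(pde & ~0xFFFu);
        uint32_t pte = page_tbl[tbl_idx];
        if (!(pte & PTE_PRESENT))
            return false;   /* page not resident -> fault, OS takes over */

        *phys = (pte & ~0xFFFu) | offset;   /* frame base + offset within page */
        return true;
    }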
So, how do we keep programs from messing with this memory management machinery? With something we call protection. We can keep processes sandboxed so that they do not interfere with the operating system or with other processes. The confusion as to why all of these terms are thrown around together stems from the fact that they are indeed interconnected. The privileges that code has are specified by the page table. The page table tells the MMU how a virtual address space is laid out, and it also tells the MMU, for each page, whether it is A) present, B) writable, C) allowed to execute code, and D) at what privilege level (ring) code on that page may run.
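For a concrete picture, those attributes live as individual bits in each 32-bit x86 page table entry. The macro names below are mine, but the bit positions are the real ones:

    #include <stdint.h>

    /* Flag bits in a 32-bit x86 page table entry (names are mine). */
    #define PTE_PRESENT   (1u << 0)  /* A) page is resident in physical RAM */
    #define PTE_WRITABLE  (1u << 1)  /* B) page may be written, not just read */
    #define PTE_USER      (1u << 2)  /* D) ring-3 code may touch this page */
    /* C) executability: classic 32-bit x86 had no per-page execute bit; the
     * NX bit mentioned below was added later as bit 63 of the wider
     * PAE/64-bit entry format. */

    /* Can user-mode code write through this entry? All three must be set. */
    static int user_can_write(uint32_t pte)
    {
        return (pte & (PTE_PRESENT | PTE_WRITABLE | PTE_USER))
                  == (PTE_PRESENT | PTE_WRITABLE | PTE_USER);
    }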
When the scheduler schedules a process, the page table is not recreated and no memory needs to be rearranged; the operating system simply tells the MMU to use a different page table. This is an O(1) operation, meaning it does not depend on the size of the process or how much memory it uses. Entire processes are rarely swapped in and out of memory at once; usually only individual pages are, which is why the term "swapping" is often clarified as "page swapping."
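On x86 that "use a different page table" step is literally a single register write: reloading CR3 with the physical address of the next process's page directory. A hedged sketch of what a 32-bit kernel's context-switch path might do (the struct is a hypothetical placeholder; a real kernel tracks much more per process):

    #include <stdint.h>

    struct process {
        uint32_t page_dir_phys;  /* physical address of this process's page directory */
    };

    /* Point the MMU at the next process's address space. On 32-bit x86 this
     * is one privileged register write; it also flushes the non-global TLB
     * entries, so the old process's translations disappear with it. */
    static void switch_address_space(const struct process *next)
    {
        __asm__ volatile("mov %0, %%cr3" :: "r"(next->page_dir_phys) : "memory");
    }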
Okay, so with that background, I will attempt to answer each of your questions:
Linear address space simply means that addresses run linearly from 0 up to 2^32 - 1, with no need for the fancy segmentation that was necessary in the days of 16-bit processors.

Virtual memory means that the linear address space of a process is defined not by main memory but by the operating system: the OS can arrange pages arbitrarily in this address space, placing itself at high addresses and the process at lower ones, for example. Additionally, the processor can specify which parts of this virtual address space are accessible at which privilege levels. The operating system (kernel) is mapped into every virtual address space so that processes can make system calls and so that there is someplace to go when they are preempted. They cannot, however, read or write that area, because the OS marks it "privileged code only"; they can only reach it via the system call mechanisms of the processor (i.e. software interrupts).

Demand paging means that a process expects certain parts of this virtual address space to have specific content (perhaps a file, or even parts of the process itself), but it isn't really there yet; the OS has marked the area "not present" in the page table. When the process finally accesses that area, the CPU throws a fault, which is trapped by the OS. The OS then loads the page and restarts the process where it left off. The result is that the process is not even aware of the hiccup, and things are loaded only as they are demanded, saving memory.
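You can actually watch demand paging from user space. On Linux (or any POSIX system), mmap a large file: the mapping is created instantly with every page marked "not present", and each first touch of a page faults into the kernel, which reads that page in and restarts the instruction. A small runnable sketch:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* Returns immediately no matter how big the file is: nothing is
         * read yet, the pages are simply marked "not present". */
        unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Each first touch of a 4 KiB page traps into the kernel, which
         * reads that page from the file, maps it, and resumes us here. */
        unsigned long sum = 0;
        for (off_t i = 0; i < st.st_size; i += 4096)
            sum += p[i];

        printf("touched %lld bytes, checksum %lu\n", (long long)st.st_size, sum);
        munmap(p, st.st_size);
        close(fd);
        return 0;
    }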
Virtual memory is the name for this entire mechanism of specifying page tables and their protections, with pages possibly residing on another medium such as a disk, hence paging. "Virtual memory" is probably the catch-all term you want for your title, except perhaps for segmentation. When referring to a specific process, I would personally use something like "the virtual address space of a process," since that unambiguously refers to the virtual memory layout of one specific process.
No. As I mentioned earlier, the OS can arbitrarily map real memory to any location in the virtual address space of a process. That means it could, for example, place the process code at address 0x0 while the heap (growing downward) starts at 0xFFFFFFFF, clear on the other side of the address space. In practice there may be constraints on where things are mapped, because device drivers need specific address ranges for hardware, but for the purpose of understanding virtual memory there is no restriction.
Segmentation is simply an addressing scheme. On the 286 it was also the mechanism for implementing protection, but that proved too inflexible, so on 32-bit processors protection is done with paging instead (though, as I understand it, the 286 protection scheme is retained for when paging is disabled). Since protection is defined by the paging mechanism, segmentation does not create any more or less risk of overwriting data than a flat memory model does. With most executable file formats, the code segment is clearly separated from the data segment. Since we can expect the code never to change, the operating system generally marks the pages of the code segment read-only, so any attempt to write to code causes a fault and the program exits. This will never occur in a modern operating system if all variables and arrays are allocated via the stack or the heap; if the program starts poking around outside of those, it will crash before it is able to overwrite any code. A greater risk (and one which used to be a big problem) is having your stack overwritten in a buffer overrun. An attacker could take advantage of this to put code on the stack and then cause it to be executed without authorization. As a fix, a new bit was added to the page table: the "No eXecute" (NX) bit, which prevents a page from ever being executed.
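You can see the read-only code pages in action with a few lines of C. On a typical modern OS the write below is stopped by the MMU and the process dies with a segmentation fault before any instruction is actually overwritten (behavior is of course platform-dependent, and converting a function pointer to a data pointer is a POSIX extension rather than strict standard C):

    #include <stdio.h>
    #include <string.h>

    static int answer(void) { return 42; }

    int main(void)
    {
        unsigned char *code = (unsigned char *)(void *)answer;

        printf("about to scribble over answer() at %p...\n", (void *)code);
        fflush(stdout);

        /* The page holding answer() is mapped read-only + executable, so
         * this store faults in the MMU and the OS kills us with SIGSEGV;
         * the machine code itself is never modified. */
        memset(code, 0x90, 16);   /* 0x90 = x86 NOP, if it ever got through */

        printf("this line is never reached\n");
        return 0;
    }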
This is not at all true. The segments simply act as pointers into an area (segment) of the linear address space. The original idea behind this was to keep pointers small: you could have a segment pointer plus a pointer within that segment, each smaller than an address covering the entire address space. For example, on the 286 (a 16-bit processor) it made sense to keep pointers at 16 bits, yet this presented a problem because the 286 could address 2^24 bytes of memory. The solution? Use segmentation. A segment could be up to 2^16 bytes large and could point anywhere in the address space; code would then use 16-bit pointers within that segment only. This was faster and more efficient. When 32-bit processors came out this mechanism was no longer necessary, but since so much existing code relied on it and programmers were accustomed to it, segmentation was kept. Newer 64-bit processors essentially do away with segmentation altogether.
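The arithmetic is just base-plus-offset. Here is a hedged sketch of 286-style protected-mode translation, with the descriptor reduced to its essentials and all names mine:

    #include <stdint.h>

    /* Simplified 286-style segment descriptor: a 24-bit base and a 16-bit limit. */
    struct descriptor {
        uint32_t base;    /* where the segment starts in the 2^24-byte space */
        uint16_t limit;   /* segment size minus one, at most 2^16 - 1        */
    };

    /* A 16-bit pointer is only meaningful relative to a segment: the
     * hardware adds the segment's base after checking the offset against
     * the limit. Returns the 24-bit physical address, or -1 on a
     * limit violation (where real hardware raises a protection fault). */
    static int32_t physical_address(const struct descriptor *seg, uint16_t offset)
    {
        if (offset > seg->limit)
            return -1;
        return (int32_t)(seg->base + offset);
    }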
The confusion here is that virtual memory is the umbrella term for many of these different mechanisms. Virtual memory is required for multitasking in order to protect one process from another process's address space. Paging, and by extension safe preemptive multitasking, is only possible with virtual memory features. Many of these features, however, can effectively be disabled. Don't want address translation? Then map every page to itself (see the sketch below). Want address translation but no memory protection? Then give every page full privileges.

In DOS and other single-process systems, confusion arises when one refers to "protected mode." Usually this means 32-bit mode as opposed to 16-bit real mode, so despite the name it does not necessarily mean that protection is used, only that in this mode it can be enabled. There are probably many single-process systems that run in this "protected mode" but use neither virtual memory nor protection; the original Xbox is a good example. There can be a slight performance gain when all of these features are disabled. Even in DOS, though, it could still be advantageous to use many of them. The most notable is page swapping: in the early days when DOS was ubiquitous, RAM was hard to come by, so any mechanism that saved RAM was welcomed and well used. Protection had its advantages in single-process systems as well, since it could keep a program from crashing in an ugly manner, allow for better debugging, and prevent data corruption due to bad hardware accesses.
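The "map every page to itself" trick is just a page table whose entry i points at frame i. A hedged sketch of building such an identity map with 32-bit x86 4 MiB "large" pages, so one directory of 1024 entries covers all 4 GiB (names are mine; a real kernel would also have to set CR4.PSE and enable paging in CR0):

    #include <stdint.h>

    #define PDE_PRESENT   (1u << 0)
    #define PDE_WRITABLE  (1u << 1)
    #define PDE_LARGE     (1u << 7)   /* PS bit: this entry maps a 4 MiB page */

    /* One page directory of 1024 entries, each mapping 4 MiB = 4 GiB total. */
    static uint32_t page_dir[1024] __attribute__((aligned(4096)));

    /* Identity-map the whole address space: virtual address == physical
     * address, every page writable. Translation is "on", yet changes nothing. */
    static void build_identity_map(void)
    {
        for (uint32_t i = 0; i < 1024; i++)
            page_dir[i] = (i << 22) | PDE_PRESENT | PDE_WRITABLE | PDE_LARGE;
    }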
I hope this answers your question.