| Hiring Company | Unsung Fields Corp. |
| Location | Kanagawa Prefecture, Yokohama-shi Nishi-ku |
| Job Type | Permanent Full-time |
| Salary | 8 million yen ~ 14 million yen |
Inference Systems Engineer (LLM Serving Runtime + Performance)
Role overview
As an inference and serving engineer, your objective is to build a high-performance, multi-tenant serving stack that squeezes maximum utilization out of heterogeneous hardware. This involves navigating the trade-offs between state-of-the-art inference frameworks and engines, and selecting and optimizing the right runtime for each workload. The scope of work is not limited to Large Language Models; it extends to the frontier of Generative AI, including high-throughput video generation and complex multimodal systems, where memory pressure and compute requirements are significantly more demanding.
Beyond deploying models at scale, this role is responsible for building a robust system that bridges the gap between boutique, high-performance clusters and massive, multi-node deployments as the company grows. This requires a deep understanding of the "Inference Triangle": constantly tuning the stack to find the optimal balance between low latency (TTFT/ITL), high throughput, and inference quality (precision/quantization). The ideal candidate is a hands-on engineer who views the entire GPU fleet as a single, programmable compute fabric and is eager to work at every level of the stack.
Responsibilities
[Employment Type]
Full-time employee
*Probationary period: 3 months
[Salary]
Annual Salary: ¥8,000,000 - ¥14,000,000
Monthly Salary: ¥666,667 - ¥1,166,667 (Monthly Base Salary: ¥666,667 - ¥1,166,667)
■Salary Increases: Available
[Working Hours]
9:00 AM - 6:00 PM (60-minute break)
[Work Location]
Queen's Tower A, 10th Floor, 2-3-1 Minatomirai, Nishi-ku, Yokohama, Kanagawa Prefecture, 220-6010
■Access: 7-minute walk from Sakuragicho Station (all lines), direct access from Minatomirai Station (Toyoko Line, Minatomirai Line)
■Non-smoking workplace
■Changes to work location: Company-designated offices
■Transfers/Secondments: None
[Holidays and Leave]
120 days off per year
Full two-day weekend
Annual paid vacation (minimum 10 days after the seventh month of employment)
[Benefits]
Partial transportation allowance (up to ¥15,000 per month)
Social insurance (health insurance, employee pension insurance, employment insurance, workers' compensation insurance)
Overtime pay: Standard overtime pay
| Minimum Experience Level | Over 6 years |
| Career Level | Mid Career |
| Minimum English Level | Business Level |
| Minimum Japanese Level | None |
| Minimum Education Level | Bachelor's Degree |
| Visa Status | No permission to work in Japan required |
You may be a fit if you have the following skills:
| Work Hours | 09:00 - 18:00 (60-minute break) |
| Holidays | Two-day weekends, holidays, special leave, 120+ days off annually |
| Industry | Software |
| Company Type | Small/Medium Company (300 employees or less) |